How the social networking giant uses open source tools to achieve its massive app scalability.
Fast throughput: memcached processes the equivalent of a 30-volume set of Encyclopedia Britannica in one tenth of a second.
To improve the performance of its PHP application, Facebook developed HipHop for PHP, a tool for converting PHP into optimised C++. HipHop involves a seven-stage process from parsing to compilation.
Move fast, have a huge impact and be bold. Those are the three central tenets of engineering at Facebook.
Facebook connects people (nodes) with other people, but it also connects people with common interests. Such relationships make scaling more complex than for a typical Web application, where many people request the same (or similar) data.
An open development environment means much of the production software began life as a simple hack by a small group of people.
Language agnostic: In addition to PHP, Facebook develops software in C++, Java, Python and Erlang. The philosophy is not to choose a single language when building infrastructure.
Facebook developed Haystack to reduce the number of file-serving I/O operations from 10 to one.
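One way a photo store can get down to a single read per file is to keep a small in-memory index that records where each photo sits inside one large append-only file. The Python sketch below illustrates that general idea only. The `NeedleStore` class, its method names and the `reads` counter are hypothetical, and this is not Facebook's Haystack code.

```python
import io


class NeedleStore:
    """Toy append-only photo store with an in-memory index (illustrative only)."""

    def __init__(self):
        self.blob = io.BytesIO()  # stand-in for one large file on disk
        self.index = {}           # photo_id -> (offset, length), held in memory
        self.reads = 0            # counts reads issued against the blob

    def put(self, photo_id, data):
        # Append the photo at the end of the blob and remember where it went.
        offset = self.blob.seek(0, io.SEEK_END)
        self.blob.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        # The index lookup costs no I/O; the photo itself costs one read.
        offset, length = self.index[photo_id]
        self.blob.seek(offset)
        self.reads += 1
        return self.blob.read(length)


store = NeedleStore()
store.put("photo-1", b"first image bytes")
store.put("photo-2", b"second image bytes")
photo = store.get("photo-2")
```

Because the location of every photo is already known in memory, fetching one needs no directory or metadata lookups on disk, only the single read of the photo's bytes.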
Facebook's storage layer is a cluster of MySQL servers -- if one or two go down the army continues to fight on.
Facebook has risen from a niche social networking service among US universities to become one of the biggest sites on the Internet, connecting millions of people from all around the world. A visualisation of live friend requests is shown here. What is less known is that [[artnid:337284|Facebook was built almost entirely from open source software|new]].
At a basic level, Facebook has a three-tier architecture with Web serving, memcached for in-memory data access and databases for persistent storage.
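The interplay between the cache tier and the database tier can be sketched as a read-through lookup: check the in-memory cache first and fall back to persistent storage only on a miss. The Python sketch below uses plain dictionaries as stand-ins for memcached and MySQL. The `ThreeTierStore` class and its methods are hypothetical names for illustration, not Facebook's code or the memcached client API.

```python
class ThreeTierStore:
    """Toy model of a cache tier in front of a database tier (illustrative only)."""

    def __init__(self, database):
        self.database = database  # dict standing in for persistent storage
        self.cache = {}           # dict standing in for memcached
        self.db_reads = 0         # counts how often the database is consulted

    def get(self, key):
        # Serve from memory when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def set(self, key, value):
        # Write to the database, then drop the stale cache entry.
        self.database[key] = value
        self.cache.pop(key, None)


store = ThreeTierStore({"user:1": "Alice"})
first = store.get("user:1")   # miss, goes to the database
second = store.get("user:1")  # hit, served from the cache
```

In this sketch the Web tier would call `get` for every piece of data a page needs; repeated requests for the same key reach the database only once until the value is updated.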
Scaling Facebook is all about fast access to interconnected data. To build a page, data needs to be pulled from multiple, disparate sources.
With a massive 25TB of data logged every day, the standard Linux syslog just didn't cut it, so Facebook developed a more scalable logging tool called Scribe. Scribe is now an open source project.
The data flow architecture at Facebook has the production Hive-Hadoop cluster at the centre. The commercial Oracle RAC product gets fed data from the main cluster.
Publicly available open source software enabled Facebook to grow at a rapid rate to sustain its exploding membership. Mindful of its roots, Facebook is also a good open source citizen and has started numerous open source projects of its own, which are now used by other social networking services. Facebook's open source software is available online at: [[xref:http://developers.facebook.com/opensource.php|http://developers.facebook.com/opensource.php|new]].
Facebook has contributed to the evolution of memcache technology.
Racks of servers: Facebook's Hadoop-Hive cluster is used by engineers and business staff for data analysis.
Facebook is adding photos at an astronomical rate -- 40 billion and counting. All photos are converted into four different sizes (for a total of 160 billion photos) before they are sent to Facebook's servers.