Jeff Rothschild, VP for Technology at Facebook, shared some detailed insights into Facebook's architecture. Over the past few years, Facebook has grown into one of the largest sites on the Internet, serving over 200 billion pages per month to more than 300 million users. The nature of social data makes engineering a site for this level of scale particularly challenging. In this presentation, Jeff discussed the aspects of social data that make scaling hard, and the core architectural components and design principles that Facebook has used to address them. He also discussed emerging technologies that offer new opportunities for building cost-effective, high-performance web architectures.
Here’s the link to the webcast of his presentation.
Site Statistics:
- Facebook is the #2 property on the Internet as measured by the time users spend on the site.
- Over 200 billion monthly page views.
- More than 3.9 trillion feed actions processed per day.
- Over 15,000 websites use Facebook content.
- In 2004, the curve plotting user population as a function of time showed exponential growth to 2M users. Five years later, Facebook has stayed on the same exponential curve, with >300M users.
- Facebook is a global site, with 70% of users outside of the US.
- Today, there are 1.3B people in the world who have quality Internet connectivity, so there is at least another factor of 4 growth that Facebook is going after.
- Jeff presented statistics for the number of users that each engineer supports at a variety of high-profile Internet companies: 1.1M for Facebook, 190,000 for Google, 94,000 for Amazon, and 75,000 for Microsoft.
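As a quick sanity check on the figures above, the arithmetic can be written out in a few lines of Python. The inputs are the numbers quoted in the talk; the variable names are only illustrative, and since the user count is quoted as "more than 300M", the headroom figure is an upper estimate.

```python
# Figures as quoted in the talk summary; names are illustrative only.
connected_people = 1.3e9   # people with quality Internet connectivity
facebook_users = 300e6     # quoted as ">300M", so this is a lower bound

# Remaining growth factor if every connected person became a user.
headroom = connected_people / facebook_users

users_per_engineer = {
    "Facebook": 1_100_000,
    "Google": 190_000,
    "Amazon": 94_000,
    "Microsoft": 75_000,
}

# How many times more users each Facebook engineer supports.
facebook_advantage = {
    company: users_per_engineer["Facebook"] / count
    for company, count in users_per_engineer.items()
    if company != "Facebook"
}

print(f"growth headroom: about {headroom:.1f}x")
for company, factor in facebook_advantage.items():
    print(f"Facebook vs {company}: about {factor:.1f}x users per engineer")
```

The headroom works out to a little over 4, which matches the "factor of 4" quoted in the talk.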
Photo sharing on Facebook:
- Facebook stores 20 billion photos in 4 resolutions
- 2-3 billion new photos are uploaded every month
- Originally provisioned photo storage for 6 months, but blew through available storage in 1.5 weeks.
- Facebook serves 600k photos/second; serving them is more difficult than storing them.
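A rough back-of-the-envelope calculation shows why serving dominates. The sketch below uses only the numbers quoted above and assumes a 30-day month and a steady upload rate; both are simplifications of mine, not figures from the talk.

```python
# Back-of-the-envelope numbers; a 30-day month and an even upload rate
# are assumptions, not figures from the talk.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

stored_photos = 20e9
resolutions = 4
stored_files = stored_photos * resolutions  # one file per resolution

uploads_per_second_low = 2e9 / SECONDS_PER_MONTH
uploads_per_second_high = 3e9 / SECONDS_PER_MONTH

served_per_second = 600_000

# Ratio of photos served to photos uploaded, per second.
read_write_ratio_low = served_per_second / uploads_per_second_high
read_write_ratio_high = served_per_second / uploads_per_second_low

print(f"stored files: {stored_files:.0f}")
print(f"uploads/second: {uploads_per_second_low:.0f} to {uploads_per_second_high:.0f}")
print(f"reads per upload: {read_write_ratio_low:.0f} to {read_write_ratio_high:.0f}")
```

Under these assumptions there are about 80 billion stored files, roughly 800 to 1,200 uploads per second, and several hundred photos served for every photo uploaded.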
Scaling photos, first the easy way:
- Upload tier: handles uploads, scales the images, and stores them on the NFS tier
- Serving tier: Images are served from NFS via HTTP
- NFS Storage tier built from commercial products
- Filesystems aren’t really good at supporting large numbers of files
Scaling photos, 2nd generation:
- Cachr: cache the high volume smaller images to offload the main storage systems.
- Only 300M images in 3 resolutions
- Distribute these through a CDN to reduce network latency.
- Cache them in memory.
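The idea of keeping the hot, small images in memory in front of the main storage tier can be sketched as follows. This is a minimal illustration, not Cachr itself: the talk summary does not describe its eviction policy or interface, so the LRU policy, the class name `SmallImageCache`, and the `fetch_from_storage` callback are all assumptions.

```python
from collections import OrderedDict
from typing import Callable, List


class SmallImageCache:
    """In-memory LRU cache in front of a slower store (hypothetical sketch)."""

    def __init__(self, fetch_from_storage: Callable[[str], bytes], capacity: int) -> None:
        self._fetch = fetch_from_storage
        self._capacity = capacity
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]
        # Cache miss: fall through to the backing store, then remember the result.
        self.misses += 1
        data = self._fetch(key)
        self._entries[key] = data
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return data


# Usage: a stand-in for the storage tier that records every read it serves.
storage_reads: List[str] = []


def read_from_storage(key: str) -> bytes:
    storage_reads.append(key)
    return f"image-bytes-for-{key}".encode()


cache = SmallImageCache(read_from_storage, capacity=2)
for requested in ["a", "a", "b", "c", "a"]:
    cache.get(requested)

print(cache.hits, cache.misses, storage_reads)
```

Only the second request for "a" is served from memory in this run; with a capacity of two, "a" is evicted when "c" arrives and has to be fetched again. In practice the point is that the most frequently requested small images stay resident, so most requests never reach the storage tier.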
Scaling photos, 3rd Generation System: Haystack
- How many I/Os do you need to serve an image? Originally, 10 I/Os at Facebook because of the complex directory structure.
- Optimizations got it down to 2-4 I/Os per file served.
- Facebook built a better version called Haystack by merging multiple files into a single large file. In the common case, serving a photo now requires 1 I/O operation. Haystack is available as open source.
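To make the "many files merged into one large file" idea concrete, here is a minimal sketch. It is not Haystack's actual on-disk format or API: the class name `TinyHaystack`, the in-memory index of offsets, and the lack of headers, checksums, or deletion are all simplifying assumptions. The point it illustrates is that once the offset and size of a photo are known from memory, a read is a single seek and read on one already-open file, with no directory traversal.

```python
import os
import tempfile
from typing import Dict, Tuple


class TinyHaystack:
    """Append-only blob file with an in-memory index (hypothetical sketch)."""

    def __init__(self, path: str) -> None:
        # One large file holds every photo; "w+b" starts a fresh store.
        self._file = open(path, "w+b")
        # photo_id -> (offset, size); kept in memory so lookups cost no disk I/O.
        self._index: Dict[int, Tuple[int, int]] = {}

    def put(self, photo_id: int, data: bytes) -> None:
        # Always append at the end of the file and record where the bytes landed.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._file.flush()
        self._index[photo_id] = (offset, len(data))

    def get(self, photo_id: int) -> bytes:
        # One seek plus one read on the single large file.
        offset, size = self._index[photo_id]
        self._file.seek(offset)
        return self._file.read(size)

    def close(self) -> None:
        self._file.close()


# Usage: store two photos in one file and read them back by id.
with tempfile.TemporaryDirectory() as tmp_dir:
    store = TinyHaystack(os.path.join(tmp_dir, "haystack.dat"))
    store.put(1, b"first photo bytes")
    store.put(2, b"second photo")
    first = store.get(1)
    second = store.get(2)
    store.close()

print(first, second)
```

The trade-off in this sketch is that the index must fit in memory and be rebuilt or persisted separately, which a production system would have to handle.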
Facebook architecture consists of:
- Load balancers at the front end distribute requests to web servers, which retrieve the actual content from a large memcached layer because of the latency requirements for individual requests.
- Presentation Layer employs PHP
- Simple to learn: small set of expressions and statements
- Simple to write: loose typing and universal “array”
- Simple to read
But this comes at a cost:
- High CPU and memory consumption.
- C++ interoperability is challenging.
- PHP does not encourage good programming in the large (at 3M lines of code it is a significant organizational challenge).
- Initialization cost of each page scales with size of code base
Facebook engineers therefore implemented a number of optimizations to PHP:
- Lazy loading
- Cache priming
- More efficient locking semantics for variable cache
- Memcache client extension
- Asynchronous event-handling
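Lazy loading, the first item in the list above, addresses the problem that per-page initialization cost grows with the size of the code base. The talk summary does not say how Facebook implemented it in PHP, so the following is only a generic sketch in Python; `LazyRegistry` and the component names are hypothetical. Components are registered cheaply up front and are only initialized the first time a page actually uses them.

```python
from typing import Callable, Dict, List


class LazyRegistry:
    """Initializes a component on first use only (hypothetical sketch)."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], object]] = {}
        self._loaded: Dict[str, object] = {}

    def register(self, name: str, loader: Callable[[], object]) -> None:
        # Registration is cheap: the loader is stored but not called.
        self._loaders[name] = loader

    def get(self, name: str) -> object:
        # Pay the initialization cost only when the component is first requested.
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]


# Usage: three components are known, but the page only touches one of them.
load_log: List[str] = []


def make_loader(name: str) -> Callable[[], object]:
    def loader() -> object:
        load_log.append(name)  # stands in for expensive initialization
        return f"{name}-component"
    return loader


registry = LazyRegistry()
for component in ["feed", "photos", "messages"]:
    registry.register(component, make_loader(component))

feed_first = registry.get("feed")
feed_again = registry.get("feed")

print(load_log)
```

In this run only the "feed" component is ever initialized, and only once, even though three components are registered and "feed" is requested twice.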