We use solid-state drives (SSDs) rather than regular hard drives to power our cache servers. This blog post will dive into our all-SSD approach, the economics behind our thinking, how we manage to have large caches without sacrificing performance, and why even a small increase in cache hit ratios can result in savings for you — and a much better experience for your end users.
Why we use solid-state drives
Fastly uses commodity servers. We don’t use any special parts or custom hardware, just regular components that you could buy from Best Buy or NewEgg (apart from the custom bezels on our servers).
However, this depends on a very broad definition of “commodity.” Each server has 384 GB of RAM, a whopping 6 TB of SSD space (made up of twelve 500 GB drives), and, just for some extra space, 25 MB of L3 cache per CPU. A typical point of presence (POP) has 32 machines specced like this.
We’re often asked why we use SSDs, which are essentially scaled-up versions of the flash memory found in digital cameras, given how expensive they are. An old-school 250 GB hard drive, if you can get one, will set you back about $20, whereas a 250 GB SSD will be closer to $120.
The reason is that we don’t think in terms of dollars-per-gigabyte. We think in dollars-per-IOPS.
IOPS stands for input/output operations per second or, in layman’s terms, how many reads and writes a disk can perform each second. That matters because fetching items from cache quickly is a big part of how fast your site is delivered. A typical hard drive can perform approximately 450 IOPS when reading a bunch of 4 KB files and 300-400 IOPS when writing. The SSDs Fastly uses, however, will execute somewhere in the region of 75,000 IOPS reading and 11,500 when writing.
When it comes to dollars-per-IOPS, SSDs beat the pants off regular drives, and allow Fastly to read an object off our disks in under a millisecond (actually, 95% of the time that number is under 500 microseconds).
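To make the comparison concrete, here is a small sketch of the arithmetic using only the prices and read IOPS quoted above. The function names are ours, invented for illustration, and real drive prices and performance will vary.

```python
# Figures quoted in this post for 250 GB drives (read IOPS only).
CAPACITY_GB = 250
HDD_PRICE_USD = 20
SSD_PRICE_USD = 120
HDD_READ_IOPS = 450
SSD_READ_IOPS = 75_000


def dollars_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Cost of one gigabyte of storage."""
    return price_usd / capacity_gb


def dollars_per_iops(price_usd: float, iops: float) -> float:
    """Cost of one I/O operation per second of throughput."""
    return price_usd / iops


hdd_cost = dollars_per_iops(HDD_PRICE_USD, HDD_READ_IOPS)
ssd_cost = dollars_per_iops(SSD_PRICE_USD, SSD_READ_IOPS)

# How much more the SSD costs per gigabyte.
gb_penalty = dollars_per_gb(SSD_PRICE_USD, CAPACITY_GB) / dollars_per_gb(
    HDD_PRICE_USD, CAPACITY_GB
)
# How much cheaper the SSD is per read IOPS.
advantage = hdd_cost / ssd_cost

print(f"HDD: ${hdd_cost:.4f} per read IOPS")
print(f"SSD: ${ssd_cost:.4f} per read IOPS")
print(f"SSD costs {gb_penalty:.0f}x more per GB, {advantage:.1f}x less per read IOPS")
```

On these numbers the SSD is several times more expensive per gigabyte but more than an order of magnitude cheaper per read operation, which is the trade-off the rest of this post builds on.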
Constant lookup times matter
More important still, no matter how large an SSD is, the time to fetch a piece of data (otherwise known as the lookup time) is constant. In contrast, the lookup time for an old hard drive depends on the size of the drive. This is because a hard drive is made up of several rotating platters covered in iron oxide (to put it another way, a hard drive is quite literally full of spinning rust). When the hard drive wants to fetch a piece of data, it has to move a read/write head to the correct track of the platter and wait for the right spot to spin around underneath it, much like an old record player. The larger the drive, the longer it takes on average to get a specific piece of information.
However, because SSDs are solid-state, they have no moving parts and we don’t have to wait for the disk platters to spin around and the read/write head to move. This means that the lookup time for an object is constant. So, no matter how large the drive, that average never goes up.
Why is this important? There are two main benefits:
Fastly can make our caches very large without sacrificing performance. The larger the cache, the more content we can put in it, and the longer we can keep it. Those two things combined mean you get a higher cache hit ratio, and a higher cache hit ratio means your site is going to get delivered faster.
Because we can fetch objects from disk faster, we can handle more requests per second on an individual machine. This means that Fastly needs fewer machines to handle a given level of traffic, and the fewer machines there are, the more likely it is that a requested object is already in the cache of the machine handling the request. Again, this leads to a higher cache hit ratio and a faster site.
Improving your cache hit ratio and reducing server costs
Let’s say you had a 90% cache hit ratio with your previous CDN. Then you switch to Fastly, and because of our larger caches, your cache hit ratio becomes 95%.
That five-percentage-point increase may not seem like much at first, but it’s actually a big deal. Instead of 10% of your traffic going back to origin, only 5% does. We’ve cut your origin traffic by 50%.
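The reason a small change in hit ratio has such a large effect is that origin traffic depends on the miss ratio, not the hit ratio. A minimal sketch of that arithmetic, using the 90% and 95% figures from the example (the function names are ours, for illustration):

```python
def miss_ratio(hit_ratio: float) -> float:
    """Share of requests that go back to origin."""
    return 1.0 - hit_ratio


def origin_traffic_reduction(old_hit_ratio: float, new_hit_ratio: float) -> float:
    """Fraction by which origin traffic shrinks when the hit ratio improves."""
    old_miss = miss_ratio(old_hit_ratio)
    new_miss = miss_ratio(new_hit_ratio)
    return (old_miss - new_miss) / old_miss


reduction = origin_traffic_reduction(0.90, 0.95)
print(f"Origin traffic reduced by {reduction:.0%}")
```

The same function shows why gains get more valuable as hit ratios climb: going from 98% to 99% also halves origin traffic, even though the hit ratio moves by only one point.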
The objects in the 50% we’ve saved you get served straight from cache in under a millisecond rather than going back to your origin. An origin fetch costs a network round trip of around 75-100 milliseconds within the US (double that for Europe), plus probably another 500 milliseconds or more for the origin to serve the response. At the upper end that adds up to roughly 700 milliseconds, so the content gets sent back to the browser from our edge node around 700 times faster.
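As a rough sketch of where that figure comes from, the snippet below adds up the numbers quoted above. It assumes a cache hit takes a full millisecond (the post only says "under"), reads "double that for Europe" as doubling the round trip, and ignores the hop between the user and the edge. The mean-latency calculation is our own extension of the example, not a measured result.

```python
CACHE_HIT_MS = 1.0               # assumed upper bound for "under a millisecond"
ORIGIN_SERVE_MS = 500.0          # "another 500 milliseconds or more to serve"
US_ROUND_TRIP_MS = (75.0, 100.0)


def origin_fetch_ms(round_trip_ms: float, serve_ms: float = ORIGIN_SERVE_MS) -> float:
    """Time for a cache miss: network round trip plus origin processing."""
    return round_trip_ms + serve_ms


def mean_latency_ms(hit_ratio: float, miss_ms: float, hit_ms: float = CACHE_HIT_MS) -> float:
    """Average fetch time across hits and misses for a given hit ratio."""
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


us_miss_ms = origin_fetch_ms(US_ROUND_TRIP_MS[0])          # best case within the US
europe_miss_ms = origin_fetch_ms(2 * US_ROUND_TRIP_MS[1])  # upper end, doubled for Europe
speedup = europe_miss_ms / CACHE_HIT_MS

print(f"US miss: {us_miss_ms:.0f} ms, Europe miss: {europe_miss_ms:.0f} ms")
print(f"Cache hit is about {speedup:.0f}x faster than the slowest miss")
for hit_ratio in (0.90, 0.95):
    print(f"{hit_ratio:.0%} hit ratio -> {mean_latency_ms(hit_ratio, europe_miss_ms):.1f} ms mean")
```

Under these assumptions, moving from a 90% to a 95% hit ratio roughly halves the average fetch time, because misses dominate the mean.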
When we serve more of your content from our caches, not only do you spend less on bandwidth, but you can also cut the number of origin servers you need to handle the cache misses. In the example above, you can cut the number in half, which saves you even more money. For example, HotelTonight saw an 80% reduction in traffic hitting their servers after switching to Fastly, allowing them to save on origin infrastructure expenditure.
The lifetime of objects in Fastly’s cache
We use a variety of techniques to get the most out of our hardware, one of which is an object clustering system that uses consistent hashing and primary and secondary machines within a POP to maximize performance and reduce cache misses. This architecture also has the benefit of allowing us to add more machines to the cluster, increasing storage with almost no drawback.
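For readers unfamiliar with consistent hashing, here is a minimal, generic sketch of the idea. It is not Fastly's implementation: the `HashRing` class, the machine names, the number of ring points per machine, and the rule that the secondary is simply the next distinct machine on the ring are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring that returns a primary and a secondary machine."""

    def __init__(self, machines, points_per_machine=64):
        unique = set(machines)
        if len(unique) < 2:
            raise ValueError("need at least two machines for a primary and a secondary")
        # Each machine owns several points so load spreads more evenly.
        self._ring = sorted(
            (ring_position(f"{machine}#{i}"), machine)
            for machine in unique
            for i in range(points_per_machine)
        )
        self._positions = [position for position, _ in self._ring]

    def lookup(self, object_key):
        """Walk clockwise from the key's position to the first two distinct machines."""
        start = bisect.bisect_right(self._positions, ring_position(object_key))
        owners = []
        for step in range(len(self._ring)):
            machine = self._ring[(start + step) % len(self._ring)][1]
            if machine not in owners:
                owners.append(machine)
            if len(owners) == 2:
                break
        return owners[0], owners[1]


# Hypothetical POP of 32 machines, then the same POP with one machine added.
machines = [f"cache-{n:02d}" for n in range(1, 33)]
ring_before = HashRing(machines)
ring_after = HashRing(machines + ["cache-33"])

keys = [f"/images/{n}.jpg" for n in range(1000)]
moved = [k for k in keys if ring_before.lookup(k)[0] != ring_after.lookup(k)[0]]
print(f"{len(moved)} of {len(keys)} objects changed primary after adding a machine")
```

The property worth noticing is that the only objects whose primary changes are the ones that move to the new machine. Everything else stays where it was, which is why a cluster built this way can grow without invalidating most of its cache.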
We also wrote a custom storage engine to effectively bypass the file system and squeeze every last drop of performance out of our SSDs, cramming as much data on them as possible. In addition, we use various algorithms to keep commonly used data in that 384 GB of RAM, making it even faster. For some assets, such as performance-monitoring pixels and “Like” or “Share” buttons that never change and are requested millions of times a second, we go even further by serving them directly from the processor’s L3 cache.
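The general idea of keeping hot data in a faster tier can be sketched with a small least-recently-used (LRU) cache in front of a larger store. This post does not say which algorithms Fastly uses, so LRU is our stand-in, and the `TwoTierCache` class, its plain-dictionary "SSD" tier, and the `fetch_from_origin` callback are all hypothetical.

```python
from collections import OrderedDict


class TwoTierCache:
    """Small LRU 'RAM' tier in front of a larger 'SSD' tier (both in memory here)."""

    def __init__(self, ram_capacity, fetch_from_origin):
        self.ram = OrderedDict()   # ordered from least to most recently used
        self.ssd = {}              # stand-in for the large on-disk cache
        self.ram_capacity = ram_capacity
        self.fetch_from_origin = fetch_from_origin

    def get(self, key):
        """Return (value, tier), where tier names the place the object was found."""
        if key in self.ram:
            self.ram.move_to_end(key)  # mark as most recently used
            return self.ram[key], "ram"
        if key in self.ssd:
            value, tier = self.ssd[key], "ssd"
        else:
            value, tier = self.fetch_from_origin(key), "origin"
            self.ssd[key] = value
        self._promote(key, value)
        return value, tier

    def _promote(self, key, value):
        """Copy an object into RAM, evicting the least recently used one if full."""
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)


cache = TwoTierCache(ram_capacity=2, fetch_from_origin=lambda key: f"body of {key}")
tiers = [cache.get(key)[1] for key in ["/a", "/b", "/a", "/c", "/b"]]
print(tiers)  # ['origin', 'origin', 'ram', 'origin', 'ssd']
```

In the usage example, `/b` is pushed out of the small RAM tier when `/c` arrives, but the next request for it is still a cache hit because the larger tier kept a copy.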
The other thing about having large, powerful drives is that content can remain in the cache for a very long time, even if it’s not frequently accessed. The upshot is that we’re fast even for long-tail content. If your origin servers go down for whatever reason, we can still serve your old content until you can bring them back up, giving you much-needed breathing room to fix the problem. In some cases, your end users will never even notice the site is down.
When we designed Fastly's content delivery network, we spent a lot of time thinking through different variables, trying to balance cost, convenience, functionality, and speed. Our goal is to maximize efficiency without compromising performance for our customers.
The results might seem counterintuitive: opting for more expensive components paradoxically ends up being less expensive. But it only goes to show that starting from a clean sheet and thinking about problems a little laterally, without needing to account for any legacy baggage, can deliver really unexpected results.
By spending more money upfront and avoiding legacy technology, we’re able to provide better service for customers and also pass down long-term savings. Dive deeper into this by watching our CEO Artur Bergman speak at the 2011 Velocity Conference in Santa Clara. Warning: the video contains frank views, strong language, Scandinavian accents, and a pair of Tiger embroidered Maharishi trousers.