Surrogate Keys: Part 1

Here at Fastly HQ, we want websites to be fast. Caching is commonly used to speed up websites. However, caching rapidly changing and unpredictably updated content can be difficult. To make it easier, we built surrogate keys: a system that makes it possible to quickly purge related content.

Let’s walk through an example of how surrogate keys are useful. Imagine you’re building a web 2.0 picture-hosting website, and your customers notice that it’s running a bit slow. You know you have a huge problem that needs to be fixed right away. Latency kills your end-user experience and is the number one reason users leave.

In order to speed up the site, you decide to use Fastly.

A good start would be to cache picture pages by their URL (e.g., www.example.com/pic/id). If a picture’s information changes, you’ll need to remove the old version by sending a purge request to the Fastly API:

PURGE /pic/{id}
Host: example.com
Accept: */*
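
As a rough sketch, the same request could be built in Python with the standard library. The helper name build_url_purge and the picture ID 42 are assumptions for illustration; the request is only constructed here, not sent.

```python
import urllib.request


def build_url_purge(host: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a PURGE request for one cached URL."""
    # Hypothetical helper: host and path stand in for your own site.
    return urllib.request.Request(
        url=f"http://{host}{path}",
        method="PURGE",
        headers={"Accept": "*/*"},
    )


request = build_url_purge("example.com", "/pic/42")
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

One request like this is needed per URL, which is what makes purging many related pages slow.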

However, you have to be careful about how you cache these pages. If you display a user’s information next to each of their pictures, you need to purge all of that user’s picture pages whenever the information changes. You could send an individual purge for each picture, but that might take too long. Alternatively, you could cache the pages for a very short amount of time, but that would waste your servers’ resources. Surrogate keys solve this problem.

Before we delve too deeply into how, it’s useful to understand a bit about how Fastly works. Users send us requests for your content. If a user requests content that we haven’t cached, we make a request to one of your servers. Your server’s response might look something like this:

HTTP/1.1 200 OK
Content-Type: text/html
Connection: keep-alive
...

When your server responds to a request for uncached content, you can add an HTTP header field called Surrogate-Key and include arbitrary space-delimited strings, for example:

HTTP/1.1 200 OK
Surrogate-Key: key1 key2 key3
Content-Type: text/html
...

This response includes three surrogate keys: key1, key2, and key3. When this response reaches our servers we strip out the surrogate keys and create a mapping from each key to the cached content.
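
To make the idea of a key-to-content mapping concrete, here is a toy in-memory model in Python. It is not Fastly's implementation; the ToyCache class and its method names are invented for this illustration.

```python
from collections import defaultdict


class ToyCache:
    """Toy in-memory cache that indexes cached objects by surrogate key."""

    def __init__(self):
        self.objects = {}                    # URL -> cached body
        self.key_index = defaultdict(set)    # surrogate key -> set of URLs

    def store(self, url, body, surrogate_key_header):
        """Cache a response and record each of its surrogate keys."""
        self.objects[url] = body
        # The header value is a space-delimited list of keys.
        for key in surrogate_key_header.split():
            self.key_index[key].add(url)

    def purge_key(self, key):
        """Remove every cached object tagged with key; return the count."""
        removed = 0
        for url in self.key_index.pop(key, set()):
            if url in self.objects:
                del self.objects[url]
                removed += 1
        return removed


cache = ToyCache()
cache.store("/pic/1", "<html>1</html>", "key1 key2")
cache.store("/pic/2", "<html>2</html>", "key2 key3")
removed = cache.purge_key("key2")
print(removed, cache.objects)
```

Both objects carry key2, so a single purge of that key removes them together.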

Back to our example. In this case, you could add a surrogate key to all of a user’s picture pages (e.g., /user/542, /user/25). Whenever a user updates their information, you can purge all the pictures by sending us a purge for that user’s surrogate key, for example:

PURGE /service/id/purge/user/542
…

Instead of having to purge each picture individually, now you can update them all with just one request.
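
A minimal Python sketch of that single request follows. It mirrors the path shape shown above; the API host, the SERVICE_ID placeholder, and the helper name are assumptions, and authentication is left out because the example above elides the rest of the request. Check the current Fastly API documentation for the exact method and headers before relying on it.

```python
import urllib.request

API_BASE = "https://api.fastly.com"  # assumed API host


def build_key_purge(service_id: str, surrogate_key: str) -> urllib.request.Request:
    """Build (but do not send) a purge request for one surrogate key."""
    # Hypothetical helper; credentials would need to be added as headers.
    return urllib.request.Request(
        url=f"{API_BASE}/service/{service_id}/purge/{surrogate_key}",
        method="PURGE",
    )


key_purge = build_key_purge("SERVICE_ID", "user/542")
print(key_purge.get_method(), key_purge.full_url)
```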

Some time later at the picture hosting website, you decide to build a browsing version of the site. Since you also recently built a mobile version, you now have tons of images to cache, and you have to make sure they load fast.

Whenever you make a change, you’ll need to purge the desktop, mobile, and browsing versions at the same time. In the future, you might build yet another version of the website, so you’ll need to be able to purge all the versions at once.

Again, using surrogate keys solves this problem. You can tag the different versions of a picture page with the same surrogate key (e.g., pic/76, pic/345), and purge them all at once.
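
One way to keep the versions in sync is to have every handler build its Surrogate-Key header with the same helper. The sketch below assumes that approach; the function names and the example IDs are hypothetical.

```python
def surrogate_key_header(*keys: str) -> str:
    """Join keys into a space-delimited Surrogate-Key header value."""
    for key in keys:
        # Keys are separated by spaces, so a key cannot contain whitespace.
        if not key or any(ch.isspace() for ch in key):
            raise ValueError(f"invalid surrogate key: {key!r}")
    return " ".join(keys)


def picture_page_headers(pic_id: int, owner_id: int) -> dict:
    """Response headers shared by every version of a picture page."""
    return {
        "Content-Type": "text/html",
        "Surrogate-Key": surrogate_key_header(f"pic/{pic_id}", f"user/{owner_id}"),
    }


# The desktop and mobile handlers call the same helper for the same picture.
desktop = picture_page_headers(76, 542)
mobile = picture_page_headers(76, 542)
print(desktop["Surrogate-Key"])
```

Because both responses carry pic/76, purging that key clears every version of the page at once.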

Fast-forward: you’re working hard on your picture website (writing tests, adding features, crushing bugs) when you realize that you need a banner for the site ASAP.

Since you’re caching entire pages, just updating the header template isn’t enough. You’ll also need to purge all the pages that use this template. While you could purge every page on the site, there’s no reason to purge content that doesn’t use the header template.

Surrogate keys to the rescue! You can add a surrogate key for each template used on a page (e.g., /templates/pic/show, /templates/pic/header, /templates/pic/comment). During deployment, you can check which templates have changed and purge only the pages that use them.
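
The deployment check could look something like the following Python sketch. It assumes you can load each template's source text by its key and compares SHA-256 fingerprints between the previous and current deploy; the function names and the sample template contents are made up for the example.

```python
import hashlib


def template_fingerprints(templates: dict[str, str]) -> dict[str, str]:
    """Map each template key to a SHA-256 digest of its source text."""
    return {
        key: hashlib.sha256(source.encode("utf-8")).hexdigest()
        for key, source in templates.items()
    }


def keys_to_purge(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return template keys that were added, removed, or modified."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


previous = template_fingerprints({
    "/templates/pic/show": "<div>show</div>",
    "/templates/pic/header": "<h1>Pics</h1>",
    "/templates/pic/comment": "<p>comment</p>",
})
current = template_fingerprints({
    "/templates/pic/show": "<div>show</div>",
    "/templates/pic/header": "<h1>Pics</h1><div>banner</div>",
    "/templates/pic/comment": "<p>comment</p>",
})
changed_keys = keys_to_purge(previous, current)
print(changed_keys)  # ['/templates/pic/header']
```

Each key in the result would then be purged with one request, leaving pages that do not use the header template in the cache.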

These examples show how useful it is to be able to purge related content with surrogate keys. Surrogate keys make it easier for you to cache more types of content, especially content that changes rapidly and unpredictably.

In Surrogate Keys: Part 2, we’ll talk about the implementation of the surrogate key system.


Tyler McMullen is CTO at Fastly, where he’s responsible for the system architecture and leads the company’s technology vision. As part of the founding team, Tyler built the first versions of Fastly’s Instant Purging system, API, and Real-time Analytics. A self-described technology curmudgeon, he has experience in everything from web design to kernel development, and loathes all of it. Especially distributed systems.
