The rise of event-driven content (or how to cache more at the edge)

Hooman Beheshti
April 22, 2015

At Fastly, we've been talking about dynamic content for quite some time — specifically, ways to cache dynamic content. The premise has been that even though CDN users have been taught to think that dynamic content is never cacheable, it often is.

When we talk about this, we often start by putting definitions around what constitutes "dynamic" and what doesn’t. As we began to speak and write more about this topic, we noticed a few interesting things:

  • Using the word "dynamic" to refer both to content that is truly dynamic and to content that is cacheable continues to cause some confusion. In fact, I made fun of this in a talk I gave at Velocity Barcelona in 2014, saying that maybe we should call this type of content "static for unpredictably short or long periods of time and also maybe really dynamic." (I thought it was funny, but it didn't get a lot of laughs.)

  • There has been a growing need to cache a class of content that seems dynamic, but really isn't. In other words, businesses continue to look for ways to cache more stuff.

In trying to deal with this, we asked Fastly customers what they thought. It became clear that “static” and “dynamic” were not specific enough to describe all the different types of content out there. We decided we shouldn’t be limited by traditional labels.

In this post and going forward, we’re going to take a step back, talk about the different types of content our customers are dealing with the most, and discuss how cacheable or uncacheable they are.

Let's start with some definitions:

Static content

Static content is generally well-understood: it’s content that doesn't change very frequently (and even if it does, it changes predictably). Images, CSS, and JavaScript fall into this category, as does plenty of other seldom-changing content used across sites and applications. We cache static content by using Cache-Control headers or granular configuration that applies time-to-live (TTL) values to different kinds of objects. With static content, we know ahead of time how long we want to cache things.
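As a rough illustration of applying TTLs by object type, here is a minimal Python sketch. The TTL values, the `TTL_SECONDS` table, and the `cache_control_for` helper are all made up for this example; real values depend on your site.

```python
# Hypothetical TTLs per content type; pick values that suit your own site.
TTL_SECONDS = {
    "image/png": 30 * 24 * 3600,              # images: 30 days
    "text/css": 7 * 24 * 3600,                # stylesheets: 7 days
    "application/javascript": 7 * 24 * 3600,  # scripts: 7 days
}
DEFAULT_TTL = 3600  # fallback for any other static object: 1 hour


def cache_control_for(content_type: str) -> str:
    """Return a Cache-Control header value for a static object."""
    ttl = TTL_SECONDS.get(content_type, DEFAULT_TTL)
    return f"public, max-age={ttl}"


print(cache_control_for("text/css"))  # public, max-age=604800
```

The point is simply that the lifetime is decided up front, before anyone requests the object.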

This is the type of content that motivated the birth of the CDN, and CDNs handle static content as you'd expect: they cache and serve it from the edge to users who request it. Headers or configuration dictate the lifetime of these objects for the CDN.

Dynamic content

Going forward, we're going to use this term to indicate content that is truly dynamic and totally uncacheable. Dynamic objects are unique every time and never the same twice. AJAX calls are often dynamic. Heavily personalized content — a user’s login or credit card transactions, for example — also falls into this category.

CDNs (or any caches, for that matter) can't cache dynamic content because it's never the same twice. Instead, it's a CDN's responsibility to deliver this type of content from the origin as quickly and efficiently as possible. If you've heard the term DSA (Dynamic Site Acceleration) before, this is the type of content it applies to. Rather than using the caching capabilities of the CDN, DSA relies on the ability of the CDN to deliver this content over long distances as fast as possible. There's a lot of technology around this, but I've given a high-level overview of what this entails in a previous blog post.

Event-driven content

Event-driven content is an imposter. It looks like dynamic content, and we've been told for years that it is, in fact, dynamic and could never be cached. This isn't really true — the truth is that traditional CDNs have never had the right mechanisms and capabilities to cache this type of content. So it was a lot easier to call it dynamic, treat it as such, and leave it be.

Where dynamic content is truly uncacheable, event-driven content is actually cacheable, but unpredictably. Event-driven content is the type of content that is static for some unknown period of time, and then it may or may not change. It can change twice a month, once a year, or 25 times an hour. The key is that we don't know ahead of time.

Here are some examples of event-driven content:

  • Wiki pages: a wiki page can go for weeks without changing, and then multiple editors might rapidly update it at once as new information about the topic becomes available.

  • Sports scores: they’re constant for some time, and then a team scores and things change. The rate of change is unpredictable; scores for a basketball game, for example, change at a completely different pace than scores for a soccer game.

  • News articles: articles on a media site are also often event-driven, especially if they're being updated frequently.

  • Comments: user-submitted comments on a blog post or news article often change frequently.

  • Inventory levels: inventory content is constant for short periods of time and then can unpredictably change.

  • Stock prices: stock prices are static for short periods of time, with frequent changes (except for the 10+ hours a day where they don't change at all).

There's even content that looks static but is actually event-driven. Think of a product that is deployed as JavaScript on pages that you don't control. Every time there's a configuration change, the script changes. And when that change happens is totally unknown ahead of time.

Turns out, there is a lot of content that qualifies as event-driven, and because its static-ness is unpredictable, we've been told that it just isn't cacheable. In the past, if we did try to cache this type of content, we used very small TTLs and cached with the anxiety that there'd be a chance we'd have stale content out there for a period of time, albeit a short period of time.
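To put rough numbers on that trade-off, here is a small Python sketch. It assumes a single busy object that is requested continuously, so the cache re-fetches it from the origin every time the TTL expires; the function name and the 30-second TTL are illustrative only.

```python
def short_ttl_tradeoff(ttl_seconds: int, window_seconds: int = 3600) -> dict:
    """Estimate the cost of caching one busy object with a fixed TTL.

    Assumes the object is requested continuously, so it is re-fetched
    from the origin each time the TTL expires.
    """
    return {
        # A change made right after a fetch stays hidden until expiry.
        "worst_case_stale_seconds": ttl_seconds,
        # One origin fetch per expiry, whether or not anything changed.
        "origin_fetches_per_window": window_seconds // ttl_seconds,
    }


print(short_ttl_tradeoff(30))
# {'worst_case_stale_seconds': 30, 'origin_fetches_per_window': 120}
```

With a 30-second TTL, the origin is hit 120 times an hour even if the object never changes, and a change can still go unseen for up to 30 seconds.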

With a modern CDN and the right mechanisms in place, event-driven content can easily be freed from its oppressive dynamic heritage to be cached with ease and without anxiety.

Caching event-driven content

To reiterate, the key to event-driven content is that it's unpredictably static. The "event" isn't referring to an earthquake or an election, or some other current event. The change is triggered by an action, be it user generated, server generated, or administrative. In any case, the content changes. The interesting part, though, is that when it changes, your application will very likely know about the change. So, since the knowledge or change trigger is available to the application, the solution becomes simple: cache event-driven content at the edge, and when it changes, “uncache” it immediately and programmatically, which, in CDN-speak, means purge it instantly through an API.
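The cache-then-purge pattern can be sketched with a toy in-memory cache. This is not how a real CDN is built; `EdgeCache`, `record_goal`, and the score data are hypothetical names invented for the example, and a real purge would travel over the network.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy stand-in for a CDN edge: caches until told to purge."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, str] = {}
        self.origin_fetches = 0

    def get(self, path: str) -> str:
        if path not in self._store:          # cache miss: go to the origin
            self._store[path] = self._fetch(path)
            self.origin_fetches += 1
        return self._store[path]

    def purge(self, path: str) -> None:
        self._store.pop(path, None)          # "uncache" the object


# Hypothetical origin data: a sports score that changes on an event.
scores = {"/score/game-1": "0-0"}
edge = EdgeCache(lambda path: scores[path])


def record_goal(path: str, new_score: str) -> None:
    """The application knows about the change, so it purges right away."""
    scores[path] = new_score
    edge.purge(path)


for _ in range(1000):
    edge.get("/score/game-1")                # first call misses, rest hit
record_goal("/score/game-1", "1-0")          # the event
latest = edge.get("/score/game-1")           # re-fetched after the purge
print(latest, edge.origin_fetches)           # 1-0 2
```

A thousand and one requests cost two origin fetches, and the first request after the goal already sees the new score.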

It sounds fundamentally simple, but it isn't. To do this, your CDN needs to provide a couple of core capabilities:

  • The CDN must have the ability to purge content on demand and instantly. Once a piece of content changes, you should be able to remove it from the CDN immediately so that it can be re-fetched from your origin server(s), ensuring your end users will see the most up-to-date version. Without instant purging, event-driven content is much, much harder to cache, if it can be cached at all.

  • Purging has to be programmatic, which means the CDN should provide you with a full-featured API in order to purge content quickly and granularly.
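What a programmatic purge looks like depends on your CDN's API, so treat the following Python sketch as an assumption rather than a reference. It assumes the CDN accepts an HTTP `PURGE` request against the URL of the cached object and authenticates with a token header; the header name `X-Api-Token` and the `build_purge_request` helper are placeholders. The sketch only builds the request and does not send it.

```python
import urllib.request


def build_purge_request(url: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a purge request for one cached URL.

    Assumption: the CDN accepts an HTTP PURGE on the object's URL and
    authenticates with a token header. "X-Api-Token" is a placeholder
    name; check your CDN's API documentation for the real one.
    """
    return urllib.request.Request(
        url,
        method="PURGE",
        headers={"X-Api-Token": api_token},
    )


req = build_purge_request("https://www.example.com/score/game-1", "secret")
print(req.get_method(), req.full_url)
# Sending it would look like: urllib.request.urlopen(req, timeout=5)
```

Because the purge targets a single URL, the application can call it from whatever code path handles the change, and nothing else in the cache is disturbed.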

Instant purging through an API is part of our core feature set at Fastly, a feature set we’re continuously dedicated to enhancing (check out our recent introduction of Soft Purge).

Programmatic instant purging isn't the only thing you need to fully offload event-driven content to a CDN's caching infrastructure. As with dynamic content, application owners have traditionally been more comfortable serving event-driven content from the origin because of the critical visibility and analytics that requests for this type of content give them.

To truly cache event-driven content on a CDN, you’ll need access to real-time logs and analytics, along with historical stats; these mechanisms are vital to maintaining the same level of visibility you’d get from your origin infrastructure.
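As one example of the visibility this gives you, here is a small Python sketch that computes a per-path cache hit ratio from log records. The record shape, a `(path, cache_status)` pair with a status of `"HIT"` or `"MISS"`, is an assumption made for illustration; actual log formats vary by CDN and configuration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def hit_ratio_by_path(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute the cache hit ratio per path from (path, status) records."""
    hits: Counter = Counter()
    total: Counter = Counter()
    for path, cache_status in records:
        total[path] += 1
        if cache_status == "HIT":
            hits[path] += 1
    return {path: hits[path] / total[path] for path in total}


# Hypothetical log records, as they might arrive from a log stream.
sample = [
    ("/score/game-1", "MISS"),
    ("/score/game-1", "HIT"),
    ("/score/game-1", "HIT"),
    ("/wiki/cdn", "MISS"),
]
ratios = hit_ratio_by_path(sample)
print(ratios)
```

Fed from a real-time log stream, a calculation like this shows within seconds whether a purge caused the expected brief dip in hit ratio.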

Notice how "instant" and "real-time" are recurring themes? That's exactly what you should expect from a modern CDN in order to have the ability to cache your event-driven content. These are fundamental premises that we believe in strongly at Fastly.

Final thoughts

Event-driven content isn't new. Though it's growing, this type of content has been around for a long time. Since the mechanisms to properly cache it haven't been available, traditional CDNs have taught us to treat it like dynamic content and classify it as uncacheable. But now, the right tools are in fact available to properly cache this type of content, and we think it's important to distinguish it from content that is truly dynamic in nature.

Consider this an attempt at clarifying a confusing set of terminologies our industry has left us with. Labels and semantics aside, ultimately it’s Fastly’s job to help our customers get the most out of their CDN, which means listening to them and providing them with the right platform and technology to do just that.
