Server-sent events with Fastly

Server-sent events allow web servers to push real-time event notifications to the browser on a long-lived HTTP response. It's a beautiful part of the web platform that’s severely underused by websites, and whether it’s flight departures, stock prices, or news alerts, with Fastly you can efficiently broadcast events with low latency, to thousands or even millions of users simultaneously.

Here’s a simulation I put together of an airport departure board using real-time events:

So that’s pretty cool, and thanks to server-sent events (SSE), very simple. Even in the early days of the web, people saw the value of creating “live” content; server-sent events evolved out of this need and replaced some inventive, but horrifically convoluted solutions.

Things were pretty bad

In 2006, I was running a consulting business which was doing a project for the Financial Times, and we needed to push real-time, low-latency data to the browser for a live chat feature called Markets Live (which still runs every weekday at 11 AM in London). The chat is between two journalists commentating on the markets, and needs to be broadcast to everyone watching on the web. At the time there was no mechanism on the web that would do this, and sites would often use “polling,” i.e., issuing an XMLHttpRequest every second to see if there was new content.

Now this is fine, if you're the sort of person that likes orchestrating a denial of service attack against yourself, but I needed something a bit more efficient that would also scale to potentially tens of thousands of concurrent viewers, without prompting 10,000 requests per second to hit my server.

Enter a sackful of hacks that came to be known as comet, a term coined by Alex Russell. Essentially the idea was to dribble an HTTP response out to the browser very slowly, making it look like a file download but actually sending a stream of events. This relied on the browser's ability to render an HTTP response progressively (as it is being downloaded). Even at the time, all browsers supported this, but with a now-hilarious smorgasbord of quirks.

In Firefox, you could use an XHR's interactive event, which triggered each time more data was received (well done Mozilla!). For Opera and Safari, the body of an XHR was not accessible until the response was done, so we had to load the content in an IFRAME and send chunks of something akin to <script>top.receiveData({...});</script>. And then there was Internet Explorer.

This was the era in which Microsoft thought it was a good idea to emit clicking noises when navigating (which included IFRAMEs) and to keep spinning a huge IE logo in the top right of the browser window until the page had “finished loading.” The solution was an ActiveX control that created an in-memory page, in which you could then in turn load an endless IFRAME. I’m not even kidding about this, this is quite possibly the second ugliest hack I’ve ever implemented.

The point is, we went to a lot of effort to do something that today is easy, because all this hacking precipitated the development of the server-sent events API.

OK, so now things are mostly great

The principle of using a text-based HTTP response as a streaming transport is actually very simple and fairly easy for most people to set up on the server side. What we needed was a standard way for browsers to fire a JavaScript event every time a chunk of content is received on a slowly-loading response. This is the API that server-sent events provides:

(new EventSource("/stream")).addEventListener("newArticle", e => {
  const data = JSON.parse(e.data);
  console.log(“A new article was just published!”, data);
});

The server responds to /stream requests (the URL in this example) with a feed of double-newline separated events:

id: 1
event: newArticle
data: {"foo": 42}
id: 2
event: newArticle
data: {"foo":123, "bar":"hello"}

Pretty simple, right? As fabulous as server-sent event streams are, they do suffer from a few drawbacks:

  1. Your server needs to hold open a lot of idle TCP connections, which can require some custom optimizations to server or network hardware configuration to scale properly

  2. Your application architecture needs to be able to generate content and serve it to multiple waiting connections, which can be challenging for request-isolated backend technologies like PHP.

We can solve both these issues by fronting your SSE stream with Fastly.

Supercharging SSE with Fastly

There are a number of Fastly features that make SSE work particularly well through our network:

  • Request collapsing allows a second request for the same URL that we’re already fetching from origin to join on to the same origin request, avoiding a second origin fetch for the URL (as long as the object is still 'fresh', see below!). For normal requests, this is intended to avoid the cache stampede problem, but for streams, it also acts to “fan out” one stream to multiple users.

  • Streaming miss allows the start of an origin response to be sent to waiting clients (maybe more than one, thanks to request collapsing) before we’ve received the entire content from the origin server. Since SSE streams send chunks of data at unpredictable intervals, it’s important that the browser receives each chunk as soon as it comes out of the origin server.

  • Shielding aggregates requests from all our edge POPs in one nominated POP that is physically close to your origin server, allowing request collapsing to happen at all the edge POPs and also at the shield POP, so that no matter how many clients you are streaming to, your origin should see only one request at a time.

  • HTTP/2 allows responses to be multiplexed on the same connection. Without this, an SSE stream over HTTP/1.1 would use up a TCP connection, reducing the amount of concurrency available to load other assets from the same origin domain. This used to be 4 connections per origin, and recently it is more commonly 8 or 16, but tying one of them up permanently is bad news. HTTP/2 solves that.

We do request collapsing automatically (unless you turn it off), but for requests to be collapsed, the origin response must be cacheable and still 'fresh' at the time of the new request. You don’t actually want us to cache the event stream after it ends; if we did, future requests to join the stream would just get an instant response containing a batch of events that happened over some earlier period in time. But you do want us to buffer the response as well as streaming it out, so that a cache record exists for new clients to join onto. That means your time to live (TTL) for the stream response must be the same duration as you intend to stream for. Say your server is configured to serve streams in 30-second segments (the browser reconnects after each segment ends): the response TTL of the stream should be exactly 30 seconds (or 29, if you want to cover the possibility of clock-mis-syncs):

Cache-control: public, max-age=29

It is very important that your origin server cuts off streams after this time has elapsed. If you serve an endless stream to Fastly, we will hold those connections forever and you will exhaust the maximum number of concurrent connections to origin. Don't ever serve endless downloads through Fastly.

Next you’ll need to enable streaming miss in VCL — add the following VCL snippets to your service:

Finally, shielding is simply a matter of choosing a shield node when you define your origin server, and H2 is enabled by default as long as you’re serving over HTTPS/TLS.

SSE fanout

With all these features working together, you can emit one single event stream from origin and have that stream delivered in real time to potentially thousands or millions of users.

In the process of making my flight demo I also made a NodeJS module to manage server sent events using a pub/sub architecture. Find it as sse-pubsub on npm.

Doesn’t it use a lot of power?

Yes, SSE connections keep a mobile device’s radio powered up all the time. You should avoid connecting a SSE stream on a device that has a low battery, or possibly avoid using SSE at all unless the device is plugged in. You can check battery level using the battery status API, though at time of writing, only in Chrome.

Why not just use websockets?

Websockets get far more attention than SSE, and they are certainly more powerful and flexible, and satisfy more use cases. But this is one of these totally inappropriate uses of the word “just” which frustrates me. I know of no one that uses websockets without a client abstraction like socket.io, and I know of no one that uses server-sent events with a client abstraction, because it’s so ludicrously simple that you have no need.

Saying “why not just use websockets” is a bit like hearing me say I’m going to watch an anime with subtitles and saying “Why don’t you just learn Japanese?” It’s not that learning Japanese is not worth doing, but it’s manifestly a harder solution.

Why not use web push?

Currently, web push has more limited browser support, and has some significant designed-in constraints on the developer: the user must explicitly consent to receive the pushes, the client-side integration is substantially more complex, it requires passing messages via third party distribution services, and sites are required to display a visual notification to the user when a push is received. On the plus side, web push notifications are very power efficient, so don’t need to be paused if the device is on battery.

In the future, web push is going to cover many more use cases, but it’s fundamentally designed for long-term relationships between browser and server, while server-sent events will remain a great solution for streaming to active pages.

Andrew Betts
Head of Developer Relations
Published

7 min read

Want to continue the conversation?
Schedule time with an expert
Share this post
Andrew Betts
Head of Developer Relations

Andrew Betts is Head of Developer Relations for Fastly, where he works with developers across the world to help make the web faster, more secure, more reliable, and easier to work with. He founded a web consultancy which was ultimately acquired by the Financial Times, led the team that created the FT’s pioneering HTML5 web app, and founded the FT’s Labs division. He is also an elected member of the W3C Technical Architecture Group, a committee of nine people who guide the development of the World Wide Web.

Ready to get started?

Get in touch or create an account.