
Clearing cache in the browser

May 2, 2018 in Performance

If you’re a web developer, you, like me, have probably reached that moment in your career when you accidentally shipped a bad release of a front-end asset. And you gave it a cache lifetime of 30 years. Bad news. Your users are screwed until they manually clear their cache. Or are they?

It turns out, there are lots of ways of dealing with this without having to give your asset a short TTL. The added benefit is this also gives you the ability to plan to update your assets quickly, even when you don’t have a bad release or a problem. In all of these solutions, I’m assuming you know the URL of the asset you want to purge, and that your app is still making at least some kind of request to your server for something in which we can embed executable JavaScript, so either a script or an HTML page.

location.reload(true)

Your first solution is one that Steve Souders and Stoyan Stefanov came up with in 2012, and it takes advantage of the fact that the reload() method of the location object takes a forcedReload boolean param, which MDN notes:

Is a Boolean flag, which, when it is true, causes the page to always be reloaded from the server.

Will it load all the page’s resources from the server regardless of whether they are currently in cache, or just the top document?

Since you don’t want to interfere with what the user is doing by visibly reloading the top level document, you’ll want to use an iframe for this. In a piece of script in the top level document, you can do this:

const ifr = document.createElement('iframe');
ifr.src = "/forcereload?path=/thing/stuck/in/cache";
ifr.classList.add("hidden-iframe");
document.body.appendChild(ifr);

Then, in the /forcereload response:

<iframe src="/thing/stuck/in/cache"></iframe>
<script>
  if (!location.hash) {
    location.hash = "#reloading";
    location.reload(true);
  } else {
    location.hash = "#reloaded";
  }
</script>

To make this work, you have to create an iframe, load an HTML document unrelated to the thing you want to invalidate, and then load it again. The thing to invalidate is also loaded twice (although the first of those loads will come from cache). This is pretty bad. On top of all that, you’re left with an iframe attached to the document that you’ll want to clean up somehow, probably with a postMessage from the frame up to the parent to tell it that it can now remove the frame. And as Philip Tellis points out, an ancient but non-auto-updating version of Firefox will go into an infinite reload loop.
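A minimal sketch of that cleanup step, from the parent page's side. It assumes the framed /forcereload page announces completion with a message shaped like { type: "forcereload-done" }; that message shape and the function name removeFrameWhenDone are hypothetical conventions for illustration, not part of the original technique.

```javascript
// Parent-page side of the cleanup. The "forcereload-done" message type is a
// hypothetical convention: the framed page would send it from its "reloaded"
// branch with parent.postMessage({ type: "forcereload-done" }, location.origin).
// `win` is injectable only so the function can be exercised outside a browser.
function removeFrameWhenDone(ifr, win = window) {
  function onMessage(event) {
    // Only trust messages from our own origin, sent by this particular frame.
    if (event.origin !== win.location.origin) return;
    if (event.source !== ifr.contentWindow) return;
    if (!event.data || event.data.type !== "forcereload-done") return;
    win.removeEventListener("message", onMessage);
    ifr.remove();
  }
  win.addEventListener("message", onMessage);
}
```

In the page, this would be called once with the iframe element, right after appending it to the document.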

Turns out, this doesn’t even behave the way you might think it does anyway. The forcedReload argument, as documented by MDN, isn’t technically part of the spec for the location interface, and no browser changes when it performs a network fetch (at least in relation to subresources) based on the value of that argument. However, browsers do exhibit different behaviour for reload() itself. Chrome always loads the subresource from cache. Firefox, Edge and Safari always load it from the network.

The only effects the forcedReload argument seems to have are:

  1. In relation to the document itself (the ‘reloader’ iframe in our technique), forcedReload prompts this to be fetched over the network in Firefox if it would otherwise be fetched from cache. All other browsers always reload the document from the network.
  2. In relation to subresources (like the script we’re trying to update), if the browser makes a network request for the reload (all except Chrome), then setting forcedReload will prevent conditional requests being made if any of the resources being reloaded have ETag or Last-Modified headers. In Chrome, there’s no impact of forcedReload here – either way, no network fetch is made.

Another disadvantage of this technique is that there’s realistically no way of preventing spurious entries being added to the browser history.

This is the solution Steve Souders uses, and the test case he created for it in 2012 doesn’t work today in Chrome, confirming what I found in my testing. It seems we can put this down to a change in Chrome’s behaviour. Since this argument is not in the spec, it’s not technically a bug but I can imagine people might have implementations of this technique in the wild and it’s a shame it no longer works.

Vary + fetch

Let’s move on to a potentially better option. I’m a bit obsessed with the Vary header, and I think we can use it here. All browsers implement it, and they use it as a validator, not as a cache key, which means that if a varied header value changes, the existing cached object will be invalid for the new request, and any new object downloaded will replace the object already in cache (this behaviour differs from CDNs and other “shared” caches, which will store multiple variants of the same URL).

So let’s set a Vary header on all responses from the server, varying on something that doesn’t exist:

Vary: Forced-Revalidate

This will have no effect because browsers don’t send a Forced-Revalidate header. But fetch can:

await fetch("/thing/stuck/in/cache", {
  headers: { "Forced-Revalidate": 1 },
  credentials: "include"
});

So, what is happening here?

  1. You make a request for /thing/stuck/in/cache, and it finds a hit in the cache, but the cached object is varying by Forced-Revalidate with a key of “” (empty string). The new request carries a Forced-Revalidate value of 1, so it doesn’t match. You also include credentials with the request to ensure that the response can be used for a normal navigation request.
  2. The request is sent to the network. The server returns the new version of the file and still includes Vary: Forced-Revalidate
  3. The browser overwrites the existing cache item with the new one, which is now only valid for requests that have a Forced-Revalidate: 1 header.

But wait. Now the item in the cache will only match future requests that have a Forced-Revalidate header. The next time the browser has an ordinary reason to load this file, as a navigation or a subresource, it won’t send the special header, and you’ll miss the cache again. However, this time, the downloaded response will have a vary key of “” (empty string) and is back to being useful.
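The sequence above can be traced with a toy model. This is only an illustration of the described behaviour, assuming a cache that keeps a single entry per URL and treats the varied header value as a validator; the class SingleVariantCache and the version strings are made up for the example and do not correspond to any browser API.

```javascript
// Toy model of a private cache that keeps ONE entry per URL and treats Vary
// as a validator. It illustrates the steps above; it is not browser code.
class SingleVariantCache {
  constructor() {
    this.entries = new Map(); // url -> { body, varyValue }
  }

  // headerValue is the Forced-Revalidate request header ("" when absent).
  // fetchFromServer is a stand-in for the network.
  load(url, headerValue, fetchFromServer) {
    const entry = this.entries.get(url);
    if (entry && entry.varyValue === headerValue) {
      return { body: entry.body, from: "cache" };
    }
    // Vary mismatch (or nothing cached): go to the network and overwrite.
    const body = fetchFromServer(url);
    this.entries.set(url, { body, varyValue: headerValue });
    return { body, from: "network" };
  }
}

const cache = new SingleVariantCache();
let version = "v1-broken";
const server = () => version;

cache.load("/thing", "", server);                 // first visit caches v1
version = "v2-fixed";                             // a fix is deployed
const stale = cache.load("/thing", "", server);   // cache hit, still v1
const purge = cache.load("/thing", "1", server);  // fetch with the header
const next = cache.load("/thing", "", server);    // ordinary load, misses again
const later = cache.load("/thing", "", server);   // ordinary load, now a hit
console.log(stale, purge, next, later);
```

In this model, stale comes from cache with the broken version, purge and next both go to the network and return the fixed version, and later is a cache hit on the fixed version: the item is downloaded twice before the cache is useful again.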

This is better; now Edge, Chrome, Firefox and Safari all behave correctly for same-origin resources. Firefox splits the cache for cross-origin fetches vs navigations, so it won’t clear the navigation cache. And it’s possible that in the future, browsers will start to store multiple variants, making this technique ineffective. Still, with one line of JavaScript and a slightly weird bit of HTTP metadata, you end up having to load the item twice, but there’s no iframe and this code is pretty maintainable.
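The server side of this technique is a single response header. As a rough sketch, assuming a Node-style (req, res) handler of the kind passed to Node's http.createServer, a wrapper could add it to every response; the name withForcedRevalidateVary is hypothetical.

```javascript
// Wraps a Node-style (req, res) handler so that every response carries
// "Vary: Forced-Revalidate". The wrapper name is hypothetical.
function withForcedRevalidateVary(handler) {
  return (req, res) => {
    // Set before the wrapped handler runs, so it is in place when headers flush.
    res.setHeader("Vary", "Forced-Revalidate");
    return handler(req, res);
  };
}
```

Note that a wrapped handler which sets its own Vary header afterwards would overwrite this value, so in a real application the two values would need to be merged.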

Of course, ideally there’d be something you could put instead of headers: { "Forced-Revalidate": 1 } to just tell fetch to skip the cache directly…

fetch + cache:reload

Which brings me to the cache property of the Fetch API’s Request object. This is easily the simplest and most direct way to solve the problem:

await fetch('/thing/stuck/in/cache', {cache: 'reload', credentials: 'include'});

The ‘reload’ cache mode tells fetch to ignore the cache and go directly to the network, but to save any new response into the cache. As before, you include credentials so that the fetch is (supposedly) treated the same as a normal navigation for caching purposes. The new response is immediately usable for any future requests, and you don’t need any crazy headers or iframes or anything.
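If you want to know whether the purge actually succeeded, the one-liner can be wrapped in a small helper. This is a sketch only: purgeWithReload is a hypothetical name, and the fetchImpl parameter exists purely so the helper can be exercised without a browser.

```javascript
// purgeWithReload is a hypothetical helper. fetchImpl defaults to the global
// fetch and is injectable only for testing.
async function purgeWithReload(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    cache: "reload",        // skip the HTTP cache, then store the new response
    credentials: "include", // as in the article, to mirror a normal navigation
  });
  if (!response.ok) {
    // A failed refetch means the stale copy may still be in cache.
    throw new Error(`Purge of ${url} failed with status ${response.status}`);
  }
  return response;
}
```

A caller would await purgeWithReload("/thing/stuck/in/cache") and treat a rejection as a sign that the purge should be retried later.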

Sounds perfect! Well, right now this works in Edge, Firefox and Safari, and Chrome is nearly there (it works perfectly in Canary, but hasn’t made it to stable yet). Support for this for same-origin resources is much better than I expected, actually, and MDN’s support table was out of date, so this has probably landed in Safari and Edge very recently.

And yet… In Safari, this will only clear the fetch cache, and while navigations can populate fetch cache, the reverse is not true. Also, Edge is the only browser to support this cross-domain.

fetch + POST

Time to roll out some bigger guns. POST requests invalidate cached content for that URL:

A cache MUST invalidate the effective Request URI (Section 5.5 of [RFC7230]) as well as the URI(s) in the Location and Content-Location response header fields (if present) when a non-error status code is received in response to an unsafe request method.
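To make that rule concrete, here is a small model of which URLs a cache is expected to invalidate for a given request and response. It is a sketch under stated assumptions: the function name urlsToInvalidate and the lowercase-keyed header object are inventions for this example. The same-host check reflects an additional restriction in the same section of RFC 7234, which forbids invalidating a Location or Content-Location URI whose host differs from the request's, and "non-error" is taken to mean a 2xx or 3xx status as that section defines it.

```javascript
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS", "TRACE"]);

// Returns the URLs a cache should invalidate for one request/response pair.
// requestUrl must be absolute; responseHeaders is a plain object with
// lowercase header names (an assumption of this sketch).
function urlsToInvalidate(method, requestUrl, status, responseHeaders = {}) {
  const unsafe = !SAFE_METHODS.has(method.toUpperCase());
  const nonError = status >= 200 && status < 400; // 2xx or 3xx
  if (!unsafe || !nonError) return [];

  const base = new URL(requestUrl);
  const urls = [base.href];
  for (const name of ["location", "content-location"]) {
    const value = responseHeaders[name];
    if (!value) continue;
    const target = new URL(value, base); // resolve relative references
    // Skip URLs on a different host, which must not be invalidated.
    if (target.hostname === base.hostname && !urls.includes(target.href)) {
      urls.push(target.href);
    }
  }
  return urls;
}
```

For example, a POST to https://example.com/thing/stuck/in/cache answered with a 200 yields just that URL, while the same request answered with a 500, or a GET, yields an empty list.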

The question is, do browsers honor this, and does the browser cache the response? Let’s see, using fetch to generate a programmatic POST request for the stuck URL:

await fetch('/thing/stuck/in/cache', {method: 'POST', credentials: 'include'});

You’ll have to live with a preflight request, because it’s an unsafe method and you’re including credentials. It also turns out that no browser caches the result of the POST, even though it is advertising itself as cacheable (or if they do, they don’t use it to satisfy a subsequent GET). So even if you do see an invalidation, it’s going to take a minimum of 3 requests to repopulate the cache.

With that caveat, Chrome and Edge do well here, with their single view of the cache producing an invalidation for both same and cross-origin content, both for fetch and navigations. Firefox and Safari follow the same pattern we’ve seen before, of splitting navigations and fetches into separate caches, so the POST clears the fetch cache, but if your stuck object is a subresource, you’re out of luck.

POST in an iframe

Oh well, in for a penny, in for a pound, so let’s throw a FORM into an IFRAME and do a POST in there. I know, I’m sorry. Desperate times…

const ifr = document.createElement('iframe');
ifr.name = ifr.id = 'ifr_'+Date.now();
document.body.appendChild(ifr);
const form = document.createElement('form');
form.method = "POST";
form.target = ifr.name;
form.action = '/thing/stuck/in/cache';
document.body.appendChild(form);
form.submit();

There are a few obvious side effects: this will create a browser history entry, and it is subject to the same problem of the response not being cached. But it escapes the preflight requirements that exist for fetch, and since it’s a navigation, browsers that split caches will be clearing the right one.

This one almost nails it. Firefox will hold on to the stuck object for cross-origin resources but only for subsequent fetches. Every browser will invalidate the navigation cache for the object, both for same and cross origin resources.
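The snippet above leaves both the iframe and the form in the document. One possible way to tidy up, sketched here with assumptions clearly in play: postInHiddenFrame and hasLeftAboutBlank are hypothetical names, the doc parameter exists only so the code can be exercised outside a browser, and the cleanup relies on the iframe firing a load event for the POST response (which will not happen if the response is a download or has no content).

```javascript
// Hypothetical wrapper around the form-in-iframe POST that removes both
// elements once the frame has loaded the POST response.
function postInHiddenFrame(url, doc = document) {
  const ifr = doc.createElement("iframe");
  ifr.name = "ifr_" + Date.now();
  ifr.hidden = true;

  const form = doc.createElement("form");
  form.method = "POST";
  form.target = ifr.name; // submit into the iframe, not the top document
  form.action = url;
  form.hidden = true;

  return new Promise((resolve) => {
    ifr.addEventListener("load", () => {
      // Some browsers fire load for the initial about:blank document; skip it.
      if (!hasLeftAboutBlank(ifr)) return;
      ifr.remove();
      form.remove();
      resolve();
    });
    doc.body.appendChild(ifr);
    doc.body.appendChild(form);
    form.submit();
  });
}

function hasLeftAboutBlank(ifr) {
  try {
    return ifr.contentWindow.location.href !== "about:blank";
  } catch (err) {
    // Reading location throws for a cross-origin document, so it has navigated.
    return true;
  }
}
```

The returned promise resolves after cleanup, so a caller can await postInHiddenFrame("/thing/stuck/in/cache") before reloading the resource normally. The history entry mentioned above is still created.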

Clear-Site-Data

We started ugly, found perfection, and then discovered perfection wasn’t all it was cracked up to be, and ended up ugly again. So it seems apt to end this story with an option that could be subtitled “nuke it from orbit.”

Meet Clear-Site-Data, the new web developer’s weapon of mass destruction.

No matter what URL you want to purge, you can simply return this response header in response to ANY request on the target origin:

Clear-Site-Data: "cache"

And bang, your cache is gone. And not just the thing you wanted to purge either. The entire cache for your origin is toast. Which might just save your bacon in a pinch.

Another advantage of this method is that you don’t need to be in a position to run any client side JavaScript, so you can even send this in response to an image or stylesheet request. It’s glorious in its lack of sophistication and brutal efficacy.
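On the server, sending the header is all there is to it. As a sketch, assuming a Node-style (req, res) handler and a hypothetical dedicated endpoint (the header could equally be attached to any other response on the origin), it might look like this. The specification only honours the header on responses delivered over a secure connection, so this assumes the origin is served over HTTPS.

```javascript
// Node-style (req, res) handler for a hypothetical purge endpoint.
function purgeHandler(req, res) {
  // The value is a quoted string, so the double quotes are part of the header.
  res.setHeader("Clear-Site-Data", '"cache"');
  res.statusCode = 200;
  res.end("cache cleared");
}
```

A handler like this could be mounted at any path on the affected origin and requested from a page, an image tag, or a stylesheet link.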

Discussions of this feature go back several years but it’s just now starting to appear in Chrome, though at time of writing, it’s been temporarily disabled due to… reasons. So it doesn’t work in any browser right now. Boo.

Conclusion

In summary, in what situations do browsers make network requests that invalidate the cache used by subresources?

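| Technique       | Scenario                             | Firefox    | Safari  | Edge | Chrome  |
|-----------------|--------------------------------------|------------|---------|------|---------|
| location.reload | doc, forcedReload, same-origin       | Yes        | Yes     | Yes  | Yes     |
| location.reload | doc, normal, same-origin             | No         | Yes     | Yes  | Yes     |
| location.reload | doc, forcedReload, cross-origin      | Yes        | Yes     | Yes  | Yes     |
| location.reload | doc, normal, cross-origin            | No         | Yes     | Yes  | Yes     |
| location.reload | resource, forcedReload, same-origin  | Yes        | Yes     | Yes  | No      |
| location.reload | resource, normal, same-origin        | Varies [1] | Yes     | Yes  | No      |
| location.reload | resource, forcedReload, cross-origin | Yes        | Yes     | Yes  | No      |
| location.reload | resource, normal, cross-origin       | Varies [1] | Yes     | Yes  | No      |
| Vary + fetch    | same-origin                          | Yes        | Yes [3] | Yes  | Yes     |
| Vary + fetch    | cross-origin                         | No [2]     | Yes [3] | Yes  | Yes     |
| cache:reload    | same-origin                          | Yes        | No [4]  | Yes  | Yes [5] |
| cache:reload    | cross-origin                         | No [2]     | No [4]  | Yes  | Yes [5] |
| Fetch + POST    | same-origin                          | Yes        | No [4]  | Yes  | Yes     |
| Fetch + POST    | cross-origin                         | No [2]     | No [4]  | Yes  | Yes     |
| Iframe + POST   | same-origin                          | Yes        | Yes     | Yes  | Yes     |
| Iframe + POST   | cross-origin                         | Yes [6]    | Yes     | Yes  | Yes     |
| Clear-Site-Data |                                      | No         | No      | No   | No      |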

[1] Hits network unless resource has Cache-Control: immutable
[2] Splits fetch/navigation caches for foreign origins, so will not clear the navigation cache
[3] The fetch will invalidate both navigation and fetch caches but a subsequent fetch will not re-populate the navigation cache.
[4] Does not clear the navigation cache, only the fetch cache
[5] Supported in Chrome Canary today
[6] Does not clear fetch cache

There are other caches and storage capabilities in the browser which I don’t address here, such as the Service worker Cache API, but I focused here on dealing with the cache that you target with Cache-Control HTTP headers. Clearing other kinds of storage merits another post for another day!

So, in conclusion, if you want to invalidate a script or other subresource, use the Iframe + POST technique, which works in all browsers for both same-origin and cross-origin.

The “correct” way is really cache:reload, so hopefully Safari and Firefox will change their behaviour in future to allow that technique to be more practically useful.


Author

Andrew Betts | Web Developer and Principal Developer Advocate

Andrew Betts is a Web Developer and Principal Developer Advocate for Fastly, where he works with developers across the world to help make the web faster, more secure, more reliable, and easier to work with. He founded a web consultancy which was ultimately acquired by the Financial Times, led the team that created the FT’s pioneering HTML5 web app, and founded the FT’s Labs division. He is also an elected member of the W3C Technical Architecture Group, a committee of nine people who guide the development of the World Wide Web.

triblondon