
Fastly Blog

Common causes of a poor cache hit ratio and how to deal with them

The cache hit ratio (or hit ratio for short) is the ratio of hits to cacheable requests (hits and misses combined). There's also cache coverage: the ratio of cacheable requests to all requests (cacheable requests plus passes). In most cases, you'll want both to be as high as possible, since misses and passes put load on your origins and are slower than cache hits.
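As a concrete (made-up) example, suppose that out of 1,250 requests, 900 are hits, 100 are misses, and 250 are passes:

        hit ratio      = hits / (hits + misses)
                       = 900 / (900 + 100)                 = 90%

        cache coverage = (hits + misses) / (hits + misses + passes)
                       = (900 + 100) / (900 + 100 + 250)   = 80%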

In the last Varnish tip, I mentioned some common reasons why a request results in either a pass or a miss. In this tip, I'll cover them and offer some possible solutions.

Cookies

In the default VCL, the Cookie and Set-Cookie headers will cause passes. For the Set-Cookie header, this is done for security reasons: if a Set-Cookie header containing a session ID were cached, that session ID would be handed out to other clients. Seeing someone else's shopping cart or personal info tends to scare away customers.

The Cookie header isn't as dangerous, but the default VCL for Varnish assumes that if there's a Cookie, then there might be content depending on it. The default VCL takes the safe path and refuses to cache anything in the presence of any cookies.

To cache regardless of a Cookie header, there are two options. The hard way is to make sure your website never sends any Set-Cookie headers, and that there are no scripts like Google Analytics setting cookies from the client side. Since that is nearly impossible, let's cover the other way: removing the Cookie header from the request.

sub vcl_recv {
        ...
        unset req.http.Cookie;
        ...
}

Alternatively, you could copy the vcl_recv from the default VCL and leave out the part that does a return(pass); if a Cookie header is present.

This might be a little too general, so here's how you can refine it. Let's say you want to cache static content regardless of any sessions, and logged in users can be recognized by a username cookie. You would simply remove the cookie for all known paths for static content, or if username= is not found in the Cookie header.

sub vcl_recv {
        ...
        # If the URL starts with /images/, /js/ or /css/ we're dealing with
        # static content, and cookies don't matter.
        if (req.url ~ "^/(images|js|css)/") {
                unset req.http.Cookie;
        }

        # If the Cookie header doesn't contain "username=" then the user is
        # not logged in, and the cookies don't matter for _any_ content
        if (req.http.Cookie !~ "username=") {
                unset req.http.Cookie;
        }

        # Taken from the default VCL
        if (req.http.Cookie) {
                return(pass);
        }
        ...
}

Note: this article is about Varnish and its default VCL; Fastly's default VCL is different and does not pass for Cookie headers.

Since Set-Cookie is a response header, Varnish doesn't know until the origin responds whether or not to do a pass. This causes an issue with request collapsing. Request collapsing is a mechanism that — for cacheable content — enables Varnish to make only one request to your origin at a time, for a specific URL. If a response is not cacheable, it would cause serialization of requests, which in turn causes delays for your users. To prevent serialization, Varnish inserts an object in its cache with a flag that says to pass all requests for that specific URL. While that object is in the cache, all requests for that URL will be passed, and will not be serialized.
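This "hit-for-pass" behavior can be sketched in vcl_fetch roughly as follows (a sketch only; the exact default VCL and the return keyword — pass or hit_for_pass — vary by Varnish version):

        sub vcl_fetch {
                if (beresp.http.Set-Cookie) {
                        # Remember for 120 seconds that this URL must be
                        # passed, so concurrent requests aren't serialized.
                        set beresp.ttl = 120s;
                        return(hit_for_pass);
                }
        }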

However, this means that if a Set-Cookie header is sent with your main page, then all requests for your main page will pass through from then on, until the object expires or is purged.

The simplest solution is to just remove any Set-Cookie headers.

sub vcl_fetch {
        unset beresp.http.Set-Cookie;
}

In some cases, you might want to let Set-Cookie through. For instance, if users can't log in, they will be unhappy. So identify what parts of your website are allowed to keep the Set-Cookie header, and exclude them. And to prevent any serialization at all, it is a good idea to make sure the pass is initiated from vcl_recv. If you don't pass from vcl_recv, there will be a short period of serialization every time the "hit-for-pass" object expires.

sub vcl_recv {
        # Pass through all requests for /login/* and /admin/*
        if (req.url ~ "^/(login|admin)/") {
                return(pass);
        }
        ...
}

sub vcl_fetch {
        # Allow /login/* and /admin/* to set cookies, but nothing else
        if (req.url !~ "^/(login|admin)/") {
                unset beresp.http.Set-Cookie;
        }
}

Cache busters

Cache busters exploit the fact that you can add any query string parameter to a URL, and as long as the request is for a static file, or the application doesn't know the parameter in question, the request still succeeds as if the parameter were not there. For instance, Apache will respond to http://www.example.com/site.css and http://www.example.com/site.css?cb=517 as if they were the same request.

Yet caches like Varnish see a new unique URL and will make a request to origin.

This is often abused by (third-party) developers to bypass any caching you might have, and to make sure their users have the freshest possible content.

To avoid this, you can simply cut off the query string:

sub vcl_recv {
        set req.url = regsub(req.url, "\?.*$", "");
}

This regsub() removes the question mark, if present, and everything after it.
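If some of your application URLs legitimately depend on query parameters, stripping every query string is too aggressive. A safer variant (reusing the same hypothetical /images/, /js/ and /css/ paths from earlier) only strips it for static content:

        sub vcl_recv {
                # Only static assets ignore their query string;
                # application URLs keep theirs.
                if (req.url ~ "^/(images|js|css)/") {
                        set req.url = regsub(req.url, "\?.*$", "");
                }
        }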

Caching headers

The final usual suspects for this tip are the caching headers Expires and Cache-Control. If the origin response has an Expires date in the past, or a Cache-Control with the value max-age=0, the object will expire immediately. In a way this is even worse than a Set-Cookie header: no hit-for-pass object is created, the object simply expires the moment it's cached. Thanks to grace, it will in most cases still be served to requests that are already waiting, but any request a few seconds later will again result in a miss.

In the case of Time To Live (TTL), changing the response headers in vcl_fetch has no effect. Varnish determines the TTL of the object beforehand, so to cache regardless of the headers, you set beresp.ttl instead.

sub vcl_fetch {
        # Cache /static/* and CSS files for 14 days
        if (req.url ~ "^/static/" || beresp.http.Content-Type ~ "^text/css") {
                set beresp.ttl = 14d;
        }

        # Cache images for 24 hours
        if (beresp.http.Content-Type ~ "^image/") {
                set beresp.ttl = 24h; # or 86400s or 1440m
        }
        ...
}

Keep in mind that the Expires or Cache-Control headers aren't changed by setting beresp.ttl and they will be passed on to the client. You will have to remove or replace these headers if that is not what you want.
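For example, to cache in Varnish for a long time while telling clients to cache for only an hour (a sketch; pick a max-age that fits your content):

        sub vcl_fetch {
                # Cache in Varnish for 14 days...
                set beresp.ttl = 14d;
                # ...but tell clients to cache for only an hour.
                unset beresp.http.Expires;
                set beresp.http.Cache-Control = "max-age=3600";
        }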

Further help

If you are unsure why your requests aren't being cached, you can read more in the Varnish Users Guide, and specifically the chapter on performance.

If that doesn't help, there's a very active Varnish community you can ask. You can find more information about the IRC channel and mailing lists on the Varnish project website. For Fastly specific questions, you can participate in the Fastly Community forum.


Author

Rogier Mulhuijzen | Senior Professional Services Engineer

Rogier “Doc” Mulhuijzen is a senior professional services engineer and Varnish wizard at Fastly, where performance tuning and troubleshooting have formed the foundation of his 18-year career. When he’s not helping customers, he sits on the Varnish Governance Board, where he helps give direction and solve issues for the Varnish open source project. In his spare time, he likes to conquer all terrains by riding motorcycles, snowboarding, and sailing.
