Caching configuration best practices
This guide provides best practices for optimizing caching with Fastly, including strategic recommendations and links to detailed configuration guides.
Initial configuration best practices
Set up your caching foundation to ensure content is cached efficiently and securely. These practices help prevent common misconfigurations that can reduce cache hit ratios or create security vulnerabilities.
Integrate Fastly with your application platform
You can optimize caching with Fastly by customizing your application platform settings. For instructions, check out our documentation on integrating third-party services and configuring web server software. We also provide a variety of plugins to help you directly integrate Fastly with your content management system.
Enable Segmented Caching for large files
If your service delivers large files like videos, software downloads, or other resources over 20 MB, enable Segmented Caching. This feature breaks large resources into smaller segments for efficient caching and allows Fastly to handle byte-range requests without fetching entire files from origin. Segmented Caching is intended for static content only and has some limitations with other Fastly features, so review the full documentation before enabling it.
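For VCL services, Segmented Caching can also be enabled per request by setting `req.enable_segmented_caching` in `vcl_recv`. A minimal sketch, assuming large downloads live under a `/downloads/` path (the path pattern is illustrative):

```vcl
sub vcl_recv {
  # Enable Segmented Caching only for large static downloads.
  # The /downloads/ prefix is an example; match your own paths.
  if (req.url ~ "^/downloads/") {
    set req.enable_segmented_caching = true;
  }
}
```

Scoping the setting to specific paths keeps Segmented Caching limited to the static content it is intended for.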
Optimize your cache control headers
Use cache control headers to set policies that determine the maximum amount of time your content may be cached. Fastly checks headers in this priority order: Surrogate-Control, Cache-Control: s-maxage, Cache-Control: max-age, then Expires. For more details on how these headers work, check out our guide to understanding cache control headers.
To improve caching performance, consider:
- Increasing max-age values - Raise the time-to-live in your Cache-Control or Surrogate-Control headers to keep content cached longer.
- Using Surrogate-Control for split policies - Set longer cache times for Fastly while maintaining shorter times for browsers (e.g., Surrogate-Control: max-age=3600 with Cache-Control: max-age=60).
- Adding Surrogate-Key headers - Tag related content so you can purge entire collections of URLs in one API call via the Purge API.
- Avoiding restrictive directives - Remove Cache-Control: private directives from responses you want Fastly to cache.
Properly configured cache control headers ensure your content can stay cached up to your specified maximum while giving you flexibility to invalidate it earlier by purging when needed.
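Origin response headers are the usual place to express these policies, but a VCL service can also apply them at the edge. A sketch of a split policy with a surrogate key, where the TTL values and key names are illustrative assumptions:

```vcl
sub vcl_fetch {
  # Cache at the edge for an hour, overriding what the origin sent.
  set beresp.ttl = 3600s;
  # Tell browsers to revalidate after one minute.
  set beresp.http.Cache-Control = "max-age=60";
  # Tag the response so the whole collection can be purged in one call.
  set beresp.http.Surrogate-Key = "articles article-123";
  return(deliver);
}
```

Setting beresp.ttl directly is the reliable way to control the edge TTL in VCL, since Fastly has already parsed the origin's cache control headers by the time vcl_fetch runs.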
Configure Fastly to temporarily serve stale content
Serving a slightly stale response may be preferable to paying the cost of a trip to a backend, and it's almost certainly better than serving an error page to the user.
Consider using the stale-while-revalidate and stale-if-error caching directives in your Cache-Control headers, or setting the beresp.stale_while_revalidate and beresp.stale_if_error variables in VCL services.
Our guide to serving stale content describes this in more detail. Learn more about staleness and revalidation.
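For VCL services, a minimal sketch using the variables named above (the durations are illustrative, not recommendations):

```vcl
sub vcl_fetch {
  # Serve a stale copy for up to 60 seconds while a fresh
  # copy is fetched from the origin in the background.
  set beresp.stale_while_revalidate = 60s;
  # Serve a stale copy for up to a day if the origin is
  # returning errors or is unreachable.
  set beresp.stale_if_error = 86400s;
  return(deliver);
}
```

Pairing a short stale_while_revalidate window with a long stale_if_error window keeps content fresh in normal operation while giving you a large buffer during origin outages.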
Set fallback TTLs for edge cases only
Fallback TTLs apply only when your origin doesn't send cache control headers. As a best practice, configure appropriate Cache-Control headers on all responses from your origin servers instead of relying on fallback TTLs to control caching. Use fallback TTLs as a safety net for edge cases where headers are missing.
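In a VCL service, the same safety-net behavior can be sketched explicitly in vcl_fetch; the 300-second fallback here is an illustrative assumption:

```vcl
sub vcl_fetch {
  # Apply a fallback TTL only when the origin sent none of the
  # headers Fastly uses to compute a TTL.
  if (!beresp.http.Surrogate-Control &&
      beresp.http.Cache-Control !~ "(s-maxage|max-age)" &&
      !beresp.http.Expires) {
    set beresp.ttl = 300s;
  }
  return(deliver);
}
```

Guarding the assignment this way preserves any policy the origin does send, so the fallback never overrides deliberate cache control headers.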
Ongoing optimization
Monitor and tune your caching performance as your service evolves. Regular optimization ensures you're getting the most value from Fastly's caching capabilities.
Use purge instead of short cache lifetimes
It's easy to purge a Fastly service, whether for a single URL, a group of tagged resources, or an entire service cache, and it takes only a few seconds at most. To increase your cache hit ratio and the responsiveness of your site for end users, consider setting a long cache lifetime when saving things into the Fastly cache. When content changes, send a purge request to clear the old content.
Decrease your first byte timeout
When making a request to a backend server, Fastly waits for a configurable interval before deciding that the backend request has failed. This is the first byte timeout and by default is fairly conservative. If you expect your backend server to be more responsive, you can choose to 'fail faster' by decreasing this value in conjunction with serving stale objects from the cache.
To decrease your first byte timeout, edit your host and update the First byte timeout field in the Advanced options section (e.g., 15000 milliseconds).
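If you manage backends in custom VCL rather than the web interface, the same setting appears on the backend declaration. A sketch, with a hypothetical origin hostname and illustrative timeout values:

```vcl
backend F_origin {
  .host = "origin.example.com";
  .port = "443";
  .ssl = true;
  # Fail faster: give up waiting for the first byte after 5 seconds
  # so stale content can be served from cache instead.
  .first_byte_timeout = 5s;
  .between_bytes_timeout = 10s;
}
```

A shorter first byte timeout only improves the user experience when combined with stale serving; otherwise failing faster just means erroring faster.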
Use your cache hit ratio to diagnose potential caching problems
Check your cache hit ratio to diagnose caching problems. Your cache hit ratio shows what percentage of requests are served from cache versus hitting your origin. You can check it by viewing the Observability page for your service.
Aim for a 90%+ cache hit ratio. If your ratio is lower, consider investigating:
- Cache-Control headers - Check if your origin is sending Cache-Control: no-cache or other headers that prevent caching.
- Cache key fragmentation - Analyze your request logs for unnecessary query parameters or URL variations creating duplicate cache entries.
- Short TTLs - Review your TTL settings to ensure content isn't expiring too quickly.
- Unexpected traffic patterns - Monitor for bots, traffic spikes to uncached content, or new features that aren't caching properly.
A consistently high cache hit ratio confirms your caching configuration is working as intended and reducing the load on your origin servers.
Avoid unnecessary cache key modifications
By default, Fastly uses the URL and Host header to create the cache key. Avoid modifying this unless necessary. Adding too much information reduces your cache hit ratio, while adding too little can cause caching across security domains. If you need to vary cached content based on request headers, use the Vary header instead of manipulating the cache key directly. Our guide on manipulating the cache key provides additional details.
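As a sketch of the Vary approach in a VCL service, assuming you want responses to vary on Accept-Language (the header choice is illustrative):

```vcl
sub vcl_fetch {
  # Vary the cached object on a request header instead of
  # changing the cache key itself.
  if (beresp.http.Vary) {
    set beresp.http.Vary = beresp.http.Vary ", Accept-Language";
  } else {
    set beresp.http.Vary = "Accept-Language";
  }
  return(deliver);
}
```

Appending to an existing Vary header, rather than overwriting it, preserves any variation the origin already requested (such as Accept-Encoding).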