Caching configuration best practices
This guide provides best practices for optimizing caching with Fastly, including strategic recommendations and links to detailed configuration guides.
Initial configuration best practices
Set up your caching foundation to ensure content is cached efficiently and securely. These practices help prevent common misconfigurations that can reduce cache hit ratios or create security vulnerabilities.
Integrate Fastly with your application platform
You can optimize caching with Fastly by customizing your application platform settings. For instructions, check out our documentation on integrating third-party services and configuring web server software. We also provide a variety of plugins to help you directly integrate Fastly with your content management system.
Enable Segmented Caching for large files
If your service delivers large files like videos, software downloads, or other resources over 20 MB, enable Segmented Caching. This feature breaks large resources into smaller segments for efficient caching and allows Fastly to handle byte-range requests without fetching entire files from origin. Segmented Caching is intended for static content only and has some limitations with other Fastly features, so review the full documentation before enabling it.
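For VCL services, Segmented Caching can also be enabled per request by setting `req.enable_segmented_caching` in `vcl_recv`. A minimal sketch, assuming large downloads live under a `/downloads/` path (the path pattern is illustrative):

```vcl
sub vcl_recv {
  # Enable Segmented Caching only for large static downloads.
  # The /downloads/ prefix is an example; match your own paths.
  if (req.url ~ "^/downloads/") {
    set req.enable_segmented_caching = true;
  }
}
```

Scoping the setting to specific paths keeps Segmented Caching limited to the static content it is intended for.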
Optimize your cache control headers
Use cache control headers to set policies that determine the maximum amount of time your content may be cached. Fastly checks headers in this priority order: Surrogate-Control, Cache-Control: s-maxage, Cache-Control: max-age, then Expires. For more details on how these headers work, check out our guide to understanding cache control headers.
To improve caching performance, consider:
- Increasing max-age values - Raise the time-to-live in your Cache-Control or Surrogate-Control headers to keep content cached longer.
- Using Surrogate-Control for split policies - Set longer cache times for Fastly while maintaining shorter times for browsers (e.g., Surrogate-Control: max-age=3600 with Cache-Control: max-age=60).
- Adding Surrogate-Key headers - Tag related content so you can purge entire collections of URLs in one API call via the Purge API.
- Avoiding restrictive directives - Remove Cache-Control: private directives from responses you want Fastly to cache.
Properly configured cache control headers ensure your content can stay cached up to your specified maximum while giving you flexibility to invalidate it earlier by purging when needed.
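Origin response headers are the usual place to express these policies, but a VCL service can also apply them at the edge. A sketch of a split policy with a surrogate key, where the TTL values and key names are illustrative assumptions:

```vcl
sub vcl_fetch {
  # Cache at the edge for an hour, overriding what the origin sent.
  set beresp.ttl = 3600s;
  # Tell browsers to revalidate after one minute.
  set beresp.http.Cache-Control = "max-age=60";
  # Tag the response so the whole collection can be purged in one call.
  set beresp.http.Surrogate-Key = "articles article-123";
  return(deliver);
}
```

Setting beresp.ttl directly is the reliable way to control the edge TTL in VCL, since Fastly has already parsed the origin's cache control headers by the time vcl_fetch runs.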
Configure Fastly to temporarily serve stale content
Serving a slightly stale response may be preferable to paying the cost of a trip to a backend, and it's almost certainly better than serving an error page to the user.
Consider using the stale-while-revalidate and stale-if-error caching directives in your Cache-Control headers, or setting the beresp.stale_while_revalidate and beresp.stale_if_error variables in VCL services.
Our guide to serving stale content describes this in more detail. Learn more about staleness and revalidation.
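For VCL services, a minimal sketch using the variables named above (the durations are illustrative, not recommendations):

```vcl
sub vcl_fetch {
  # Serve a stale copy for up to 60 seconds while a fresh
  # copy is fetched from the origin in the background.
  set beresp.stale_while_revalidate = 60s;
  # Serve a stale copy for up to a day if the origin is
  # returning errors or is unreachable.
  set beresp.stale_if_error = 86400s;
  return(deliver);
}
```

Pairing a short stale_while_revalidate window with a long stale_if_error window keeps content fresh in normal operation while giving you a large buffer during origin outages.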
Set fallback TTLs for edge cases only
Fallback TTLs apply only when your origin doesn't send cache control headers. As a best practice, configure appropriate Cache-Control headers on all responses from your origin servers instead of relying on fallback TTLs to control caching. Use fallback TTLs as a safety net for edge cases where headers are missing.
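In a VCL service, the same safety-net behavior can be sketched explicitly in vcl_fetch; the 300-second fallback here is an illustrative assumption:

```vcl
sub vcl_fetch {
  # Apply a fallback TTL only when the origin sent none of the
  # headers Fastly uses to compute a TTL.
  if (!beresp.http.Surrogate-Control &&
      beresp.http.Cache-Control !~ "(s-maxage|max-age)" &&
      !beresp.http.Expires) {
    set beresp.ttl = 300s;
  }
  return(deliver);
}
```

Guarding the assignment this way preserves any policy the origin does send, so the fallback never overrides deliberate cache control headers.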
Ongoing optimization
Monitor and tune your caching performance as your service evolves. Regular optimization ensures you're getting the most value from Fastly's caching capabilities.
Use purge instead of short cache lifetimes
It's easy to purge a Fastly service, whether for a single URL, a group of tagged resources, or an entire service cache, and it takes only a few seconds at most. To increase your cache hit ratio and the responsiveness of your site for end users, consider setting a long cache lifetime when saving things into the Fastly cache. When content changes, send a purge request to clear the old content.
Decrease your first byte timeout
When making a request to a backend server, Fastly waits for a configurable interval before deciding that the backend request has failed. This is the first byte timeout and by default is fairly conservative. If you expect your backend server to be more responsive, you can choose to 'fail faster' by decreasing this value in conjunction with serving stale objects from the cache.
To decrease your first byte timeout, edit your host and update the First byte timeout field in the Advanced options section (e.g., 15000 milliseconds).
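If you manage backends in custom VCL rather than the web interface, the same setting appears on the backend declaration. A sketch, with a hypothetical origin hostname and illustrative timeout values:

```vcl
backend F_origin {
  .host = "origin.example.com";
  .port = "443";
  .ssl = true;
  # Fail faster: give up waiting for the first byte after 5 seconds
  # so stale content can be served from cache instead.
  .first_byte_timeout = 5s;
  .between_bytes_timeout = 10s;
}
```

A shorter first byte timeout only improves the user experience when combined with stale serving; otherwise failing faster just means erroring faster.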
Use your cache hit ratio to diagnose potential caching problems
Check your cache hit ratio to diagnose caching problems. Your cache hit ratio shows what percentage of requests are served from cache versus hitting your origin. You can check it by viewing the Observability page for your service.
Aim for a 90%+ cache hit ratio. If your ratio is lower, consider investigating:
- Cache-Control headers - Check if your origin is sending Cache-Control: no-cache or other headers that prevent caching.
- Cache key fragmentation - Analyze your request logs for unnecessary query parameters or URL variations creating duplicate cache entries.
- Short TTLs - Review your TTL settings to ensure content isn't expiring too quickly.
- Unexpected traffic patterns - Monitor for bots, traffic spikes to uncached content, or new features that aren't caching properly.
A consistently high cache hit ratio confirms your caching configuration is working as intended and reducing the load on your origin servers.
Avoid unnecessary cache key modifications
By default, Fastly uses the URL and Host header to create the cache key. Avoid modifying this unless necessary. Adding too much information reduces your cache hit ratio, while adding too little can cause caching across security domains. If you need to vary cached content based on request headers, use the Vary header instead of manipulating the cache key directly. Our guide on manipulating the cache key provides additional details.
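As a sketch of the Vary approach in a VCL service, assuming you want responses to vary on Accept-Language (the header choice is illustrative):

```vcl
sub vcl_fetch {
  # Vary the cached object on a request header instead of
  # changing the cache key itself.
  if (beresp.http.Vary) {
    set beresp.http.Vary = beresp.http.Vary ", Accept-Language";
  } else {
    set beresp.http.Vary = "Accept-Language";
  }
  return(deliver);
}
```

Appending to an existing Vary header, rather than overwriting it, preserves any variation the origin already requested (such as Accept-Encoding).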