Fastly's observability and monitoring: empowering smart delivery and high performance

Senior Manager, Product Management, Fastly

Chief Product & Strategy Officer, Fastly

February 28, 2022

Our observability features and capabilities — logging, metrics, and tracing — have always been a cornerstone of our Delivery, Security, and Compute products. That’s because we believe observability should be available to all our customers; we don’t think logging should be a restricted capability just for Enterprise-tier customers, unlike some providers.

In this post, we break down our current observability offerings and highlight some of the ways DevOps and SRE teams are using them to investigate anomalies, improve performance and uptime, and engage in observability-driven development.

So whether you're a cloud-native startup with observability front of mind, or you’re transitioning from legacy to hybrid multi-cloud, by the end of the post, you'll have a clear understanding of why engineering teams choose us as their trusted partner in innovation.

Logging

Our customers trust us to process internet-scale data, and we uphold that trust by building products that give them control. We want you to own your data, have the ability to log any aspect of the HTTP request and response (which you can configure via our API, Web Interface, or Command Line Interface), and have ownership of the logging destination.

Supported log integrations include six generic protocols, should you want to operate your own log receiver, and 19 third-party services for storage and analysis, including Amazon S3, Azure Blob Storage, Google BigQuery, Datadog, New Relic, and Splunk. Many third-party services not explicitly supported can also be used via our generic protocols and proprietary connectors, for which we have additionally documented five compatible integrations.

But control doesn’t stop there! You can also modify your logs at the edge using scripts that call our API whenever your log data meets predefined conditions. For example, you can enforce rate limiting or blocklists by adding IPs to your versionless edge ACLs based on request information, automatically mitigating Layer 7 DDoS attacks. At The Guardian, log streaming helps detect issues early after deploying changes to their site. And Foursquare determines what content is cached and what data is streamed. The list goes on, and we continue to see the benefits of real-time logging in innovative use cases.
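To make the blocklist idea concrete, here is a minimal sketch of the "predefined condition" half of such a script: it counts requests per client IP in a window of parsed log records and returns the IPs that exceed a threshold. The record shape, the `client_ip` field name, and the threshold are assumptions for illustration, not a Fastly log format. Adding the resulting IPs to an edge ACL is deliberately left out, since that call depends on your service and the API documentation.

```python
from collections import Counter


def find_abusive_ips(log_records, max_requests):
    """Return the client IPs that made more than max_requests requests."""
    # "client_ip" is a hypothetical field name; use whatever your log format emits.
    counts = Counter(record["client_ip"] for record in log_records)
    return sorted(ip for ip, n in counts.items() if n > max_requests)


# Hypothetical parsed log records for one time window.
window = [
    {"client_ip": "203.0.113.7", "status": 200},
    {"client_ip": "203.0.113.7", "status": 200},
    {"client_ip": "198.51.100.4", "status": 200},
    {"client_ip": "203.0.113.7", "status": 429},
]

candidates = find_abusive_ips(window, max_requests=2)
print(candidates)
```

In practice you would tune the window length and threshold to your own traffic, and review candidates before blocking anything automatically.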

“It’s so helpful to have more data and real-time log delivery at our fingertips. We can jump into the logs right away and get to the root cause of why something’s happening.” — Shopify

To make debugging quicker during development, the Fastly CLI provides live Log Tailing. Developers can stream their own custom log messages directly to their terminal of choice to test applications running on our compute platform, without having to configure and pay for any additional third-party log management service. Check out this blog post, in which Alex Kesler, Senior Software Engineer, explains how to get started with real-time logging and Compute@Edge.

View your stdout and stderr log output directly in your terminal with our live Log Tailing functionality

For customers delivering large-scale live events or streams, our Live Event Services provide insights into your live streaming performance and the ability to troubleshoot immediately — even if you’re using a multi-CDN strategy. And for customers wanting to get the most out of our services without tying up IT or engineering resources, we offer a Logging Insights Package. This professional services offering provides you with guided customization. After we’ve interviewed you to identify your specific business needs, we’ll write advanced queries and create customized dashboards for the logs stored in your logging endpoint.

Metrics

We offer a variety of ways to report on the performance and activity of your services. Our metrics APIs and dashboards provide real-time, per-second visibility and historical reporting.

Want to find out which geographical region had a spike in errors? Or how long a particular origin was unavailable, and why? Or how your cache hit ratio compares to previous years? Our metrics capabilities ensure our customers, developers, and partners can answer these types of questions quickly and with confidence.
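As a small worked example of the cache hit ratio question, the sketch below computes the ratio as hits divided by hits plus misses and compares two periods. The function names and the sample counts are made up for illustration; in practice the hit and miss counts would come from your historical metrics.

```python
def cache_hit_ratio(hits, misses):
    """Return hits / (hits + misses), or None when there was no cacheable traffic."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def percentage_point_change(current, previous):
    """Return the difference between two ratios, in percentage points."""
    return (current - previous) * 100


# Made-up counts for two periods.
this_year = cache_hit_ratio(hits=9_200, misses=800)
last_year = cache_hit_ratio(hits=8_500, misses=1_500)

print(f"this year: {this_year:.2%}, last year: {last_year:.2%}")
print(f"change: {percentage_point_change(this_year, last_year):+.1f} points")
```

Reporting the change in percentage points rather than as a relative percentage avoids ambiguity when the ratios being compared are already percentages.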

Our 180 service-level metrics provide insight into the health of your application, helping you understand caching, object size, compute usage, image optimization, video and streaming, and more.

View your real-time logging usage from the dashboard

Our APIs can also be integrated into third-party services for monitoring and alerting, such as Datadog, New Relic, and Sumo Logic.

At our Altitude 2020 conference, one of our customers demonstrated how to monitor Fastly in about five minutes using the open-source fastly-exporter + Prometheus + Grafana, and how to make those insights easily accessible for more holistic, efficient problem-solving.

“Fastly lets us see results instantaneously, all over the world.” — Nic Benders, Chief Architect, New Relic

And whether you prefer dark mode to limit eye strain or simply for aesthetic reasons, our web interface controls allow you to seamlessly switch between dark and light modes via your account settings menu to suit your environmental needs.

Dark mode also accommodates accessibility issues around migraines, visual impairments, and eye fatigue.

Want even more metrics? With Origin Inspector and Domain Inspector, customers have an easy way to monitor every origin and every domain without needing to send log data to a third-party data collector.

Tracing

IT and DevOps teams can use tracing to debug and monitor distributed software architectures, such as Compute@Edge applications or microservices running on Fastly.

Compute@Edge honors request tracing parameters by maintaining them when they enter and leave our platform. Developers can tag individual end-user requests with unique identifiers, helping illuminate any blind spots in multi-technology infrastructures and providing details on the request’s lifetime. Users can pass this information along to third-party systems that help with data visualization and conduct further tailored analysis.
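To illustrate what maintaining tracing parameters across a hop can look like, here is a generic sketch, not Fastly's implementation. It assumes the W3C Trace Context `traceparent` header as the format (this post does not name one) and assumes header names arrive lowercased. An incoming trace ID is preserved, a new span ID is generated for the next hop, and a fresh trace is started if no valid header is present.

```python
import re
import secrets

# W3C Trace Context, version 00: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def propagate_traceparent(headers):
    """Return a copy of headers carrying a traceparent for the next hop."""
    outgoing = dict(headers)
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    # All-zero trace or parent IDs are invalid under the specification.
    if match and match.group(1) != "0" * 32 and match.group(2) != "0" * 16:
        trace_id, flags = match.group(1), match.group(3)
    else:
        # No usable context: start a new, sampled trace.
        trace_id, flags = secrets.token_hex(16), "01"
    span_id = secrets.token_hex(8)  # identifies this hop
    outgoing["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return outgoing


incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
print(propagate_traceparent(incoming)["traceparent"])
```

Because the trace ID survives every hop, a downstream visualization tool can stitch the spans from each system into a single view of the request's lifetime.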

Working together, Epsagon — a company that monitors and troubleshoots microservice environments — and Adobe’s Project Helix team built a very cool integration using our real-time logging and edge programmability features that collects, batches, formats (as JSON), and ships tracing data off to Epsagon's HTTPS endpoint. Read more about what Lars Trieloff from Adobe and Ran Ribenzaft from Epsagon had to say.
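The batch-and-format step of a pipeline like this can be sketched in a few lines. This is not the code from the Epsagon and Adobe integration; the record fields and batch size are hypothetical, and shipping each payload to an HTTPS endpoint is omitted.

```python
import json


def batch_as_json(records, batch_size):
    """Yield JSON array strings, each holding at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield json.dumps(records[start:start + batch_size])


# Hypothetical tracing records collected from log lines.
records = [{"span_id": i, "duration_ms": 10 * i} for i in range(5)]

for payload in batch_as_json(records, batch_size=2):
    print(payload)
```

Batching trades a little latency for far fewer outbound requests, which matters when every end-user request produces tracing data.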

Summary

We’ve broken down our current observability offerings and highlighted how our customers use them to troubleshoot issues more efficiently so they can focus on delivering great products and experiences. We’d love to hear where you are on your observability journey. Please share this post and comment to let us know.

Not yet a customer? Reach out, and we'll get in touch.