When things go right, nobody notices; it’s when they go wrong that everyone pays attention — and that’s especially true when it comes to live broadcasting. Interrupted streams, jitter, authentication problems: these are the things nightmares are made of.
At Fastly, we’ve helped some of the world’s leading broadcasters and content owners deliver live events that scale from local news to the very largest sporting events — and we’ve learned a thing or two along the way. Check out our seven best practices for live streaming success:
Double down on observability
Real-time observability is essential for reacting when something goes wrong. To give yourself the best possible odds of a successful stream, work with vendors across your stack, including CDNs, that offer instant access to log files. There are no do-overs in live streaming: you need to see critical metrics live in order to react and resolve issues as quickly as possible.
Another essential element of observability is cutting through the noise. So much data is available that you have to decide up front what you want to watch and act on. Different data points lend themselves to different types of insight, and as you gain experience it becomes easier to choose which metrics to prioritize. Until then, we recommend working with vendors that can guide you through the trade-offs. For example, should you care more about time to first byte, rebuffering events, error counts, or latency? Once you've decided, set thresholds that trigger a notification the moment they're crossed. For mission-critical events, where the stakes are high, we recommend involving a third party to help you watch for issues throughout the stream.
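As a sketch of what threshold-based alerting can look like, the snippet below checks a sample of real-time metrics against fixed limits. The metric names and limit values are illustrative assumptions, not any vendor's defaults:

```python
# Minimal sketch of threshold-based alerting on streaming metrics.
# Metric names and limits are illustrative assumptions; tune them to
# what your logs actually expose.
THRESHOLDS = {
    "ttfb_ms": 500,          # time to first byte, milliseconds
    "rebuffer_ratio": 0.02,  # fraction of playback time spent rebuffering
    "error_rate": 0.01,      # error responses / total requests
}

def check_metrics(sample: dict) -> list[str]:
    """Return an alert message for every metric that crosses its threshold."""
    alerts = []
    for metric, limit in THRESHOLDS.items():
        value = sample.get(metric)
        if value is not None and value > limit:
            alerts.append(f"ALERT: {metric}={value} exceeds {limit}")
    return alerts

# A sample pulled from real-time logs:
for alert in check_metrics({"ttfb_ms": 620, "rebuffer_ratio": 0.01, "error_rate": 0.03}):
    print(alert)
```

In practice the same comparison would run continuously against your log pipeline, with the alerts routed to the operations channel rather than printed.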
Eliminate single points of failure
Redundancy is nothing new in live broadcasting, and equipment failure has been around as long as, well, equipment. In live streaming, redundancy goes a long way toward ensuring success. By combining several CDNs into a multi-CDN architecture, you minimize the impact of any single outage while delivering your content over a larger, stronger, more resilient network.
Not only will you have a backup network you can easily switch to, but this approach also lets you split traffic among several CDNs and reroute around congestion points. But remember: delivery is only one part of a live transmission. Eliminating single points of failure extends from capture devices through encoders to players, and a comprehensive plan builds in redundancy as close to the last mile as possible. The origin is no exception: content owners can send the captured streams to two or more origin servers, so that one server can instantly take over, or assist, if the other is overloaded or suffers a hardware failure.
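A dual-origin setup can be sketched as a simple failover loop: try the primary origin and fall back to the secondary on error. The origin hostnames and the `get` client callable here are illustrative assumptions:

```python
# Sketch of origin failover: try each origin in order, falling back on
# failure. Hostnames are illustrative assumptions.
ORIGINS = ["https://origin-a.example.com", "https://origin-b.example.com"]

def fetch_segment(path: str, get) -> tuple[str, bytes]:
    """`get` is any HTTP client callable that raises on failure
    (e.g. a thin wrapper over urllib); returns (origin used, body)."""
    last_error = None
    for origin in ORIGINS:
        try:
            return origin, get(origin + path)
        except Exception as exc:
            last_error = exc  # record the failure and try the next origin
    raise RuntimeError(f"all origins failed for {path}") from last_error

def flaky_get(url: str) -> bytes:
    """Stand-in HTTP client for the demo: pretend origin-a is down."""
    if "origin-a" in url:
        raise ConnectionError("origin-a unreachable")
    return b"segment-bytes"

origin, data = fetch_segment("/live/seg1.ts", flaky_get)  # falls back to origin-b
```

In a real deployment this decision typically lives in the CDN's origin-shielding or failover configuration rather than in application code, but the logic is the same.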
Enforce content restrictions at the edge
Reaching a global audience with ease is one of the many benefits of IP delivery. There are, however, several reasons you might want to limit the reach of your content; licensing, for example, could prevent you from delivering it in certain countries or regions. As you build your delivery strategy, make sure capabilities like authentication, paywalls, and rights management can be instantly configured and enforced as close to your viewers as possible. Solving for these restrictions further up the chain can overburden your origin infrastructure, while enforcing them only at the player level leaves the content open to hijacking.
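As a sketch of edge-side enforcement, the snippet below combines a geo allowlist with a signed, expiring playback token. The country codes, secret, and token layout are all illustrative assumptions; real deployments would use the CDN's built-in tokenization and geo features:

```python
import hashlib
import hmac

# Sketch of edge-side playback authorization: geo allowlist plus a
# signed, expiring token. Values here are illustrative assumptions.
ALLOWED_COUNTRIES = {"US", "CA", "GB"}
SECRET = b"rotate-me"  # shared between token issuer and edge

def make_token(path: str, expires: int) -> str:
    """Issued by the paywall/auth service after the viewer is entitled."""
    sig = hmac.new(SECRET, f"{path}{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{expires}-{sig}"

def authorize(country: str, path: str, token: str, now: int) -> bool:
    """Run at the edge, per request, before serving the stream."""
    if country not in ALLOWED_COUNTRIES:
        return False                      # licensing restriction
    expires_str, _, sig = token.partition("-")
    expires = int(expires_str)
    if now > expires:
        return False                      # token expired
    expected = hmac.new(SECRET, f"{path}{expires}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

Because the check runs at the edge, unauthorized requests never reach the origin, and a stolen token cannot be replayed past its expiry or against a different path.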
Make an accurate traffic projection
It may seem impossible to predict the size of your streaming audience, but a best-effort estimate is still worthwhile: it informs your traffic allocation plan during the live event.
If you’re using a multi-CDN approach, are you prepared to reallocate traffic between CDNs mid-stream? Even a single-vendor edge delivery strategy needs a process for rerouting client requests to maintain the best quality of experience possible. This can be done at the DNS, CDN, or anycast layer. However, before you can make a change, you need real-time data from every component, from the feed to the player, which ties back to observability preparedness. Make sure every component in your video workflow can provide this level of visibility at scale.
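Mid-stream reallocation can be sketched as a weighted CDN selector whose weights you update when one network shows trouble. The CDN names and weights are illustrative assumptions; in production the equivalent decision usually lives in DNS weights or a player-side multi-CDN switcher:

```python
import random

# Sketch of weighted CDN selection that can be rebalanced mid-stream.
# Names and weights are illustrative assumptions.
weights = {"cdn-a": 70, "cdn-b": 30}

def pick_cdn(rng=random) -> str:
    """Choose a CDN for the next client request, proportional to weight."""
    cdns, w = zip(*weights.items())
    return rng.choices(cdns, weights=w, k=1)[0]

def rebalance(new_weights: dict) -> None:
    """Shift traffic away from a congested or failing CDN."""
    weights.clear()
    weights.update(new_weights)

# Real-time logs show elevated errors on cdn-a, so drain it:
rebalance({"cdn-a": 0, "cdn-b": 100})
```

The point of the sketch is the coupling: the rebalance call is only safe to make because the real-time data told you which CDN to drain.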
The number of concurrent requests based on viewership has broad implications, including the contractual cost of delivery and the redundancies built into your video workflow. Guess too high on concurrent request volume and you could overspend and add unnecessary complexity. Guess too low, and you could face expensive overage pricing above your contractual commitments, not to mention a sub-optimal streaming experience for your viewers.
Traditional broadcasters also have to contend with shifts in total viewership alongside the changing ratio of viewers on cable and over-the-air broadcast vs. OTT. The task is easier for digital pure-plays, who can look at viewership from past events and adjust expectations relative to that baseline. We recommend taking the expected overall viewership and multiplying it by your audience's OTT vs. traditional broadcast ratio. This is by no means bulletproof, and there is no perfect way to predict the number, but practice makes these estimates steadily more accurate.
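The back-of-envelope estimate above can be written down directly: expected total audience times the OTT share, plus headroom for error, then translated into rough peak egress. All of the numbers below are illustrative assumptions:

```python
# Sketch of the viewership estimate described above. All inputs are
# illustrative assumptions, not real event data.
def estimate_concurrents(total_viewers: int, ott_ratio: float,
                         headroom: float = 0.2) -> int:
    """Peak concurrent OTT viewers to plan capacity (and contracts) for."""
    return round(total_viewers * ott_ratio * (1 + headroom))

def peak_bandwidth_gbps(concurrents: int, avg_bitrate_mbps: float = 5.0) -> float:
    """Rough egress at peak, assuming an average delivered bitrate."""
    return concurrents * avg_bitrate_mbps / 1000  # Mbps -> Gbps

# 2M total audience, 35% watching via OTT, 20% headroom:
viewers = estimate_concurrents(2_000_000, ott_ratio=0.35)
print(viewers, "concurrents,", peak_bandwidth_gbps(viewers), "Gbps at peak")
```

The output is only as good as the inputs, which is exactly why the estimate should be revisited after every event with the measured numbers.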
Test and harden your stack
You need to be as prepared as possible before your live event: test and harden your entire video delivery stack to avoid surprises on the big day, including load testing and making configuration changes on the fly with each of your vendors. Whenever possible, practice on smaller events first. Many professional sports leagues, for example, offer regional and playoff games that are great opportunities to rehearse your workflow, configurations, and processes. The data gathered will be useful for fine-tuning delivery before the big game.
Create an operational contingency plan
If you don’t have an operational contingency plan in place, make it a priority to develop one. It’s much easier to execute and react when you have defined processes rather than trying to respond during a live event. Don’t wait until the post-mortem; consider doing a “pre-mortem” brainstorm to identify potential issues.
Make sure your contingency plans are documented, vetted, disseminated, and practiced more than once so they can be executed effectively. What is the process for load balancing requests across CDNs? How will you monitor real-time data, quickly identify issues, determine where they reside, and resolve them? It's also important to set up instant messaging channels with the third parties involved, so you can react quickly and make sure the operations team has the insight it needs to make the best possible decisions.
Execute your plan with actionable insights
The steps above are a starting point, not a script. No two live events are exactly alike; even weekly sporting events vary widely in expected viewership and in where the audience and fan base are located. Architect your infrastructure so it can expand easily when needed, but also make sure there's a constant flow of data points monitoring the heartbeat of the stream in near real time. Minimize origin requests, as those can be costly, and make sure your backup plan is as solid as the primary architecture for your live stream.
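Minimizing origin requests mostly comes down to caching close to the viewer. As a sketch, a short-TTL cache at the edge means that any number of viewer requests for the same manifest within the TTL window costs only one origin fetch; the class and TTL below are illustrative assumptions:

```python
# Sketch of a short-TTL edge cache: repeated requests within the TTL
# window are served from cache, so only one origin fetch is made per
# window. Names and TTL are illustrative assumptions.
class SegmentCache:
    def __init__(self, fetch_origin, ttl: float = 2.0):
        self.fetch_origin = fetch_origin  # callable: path -> bytes
        self.ttl = ttl                    # seconds; short, since live manifests change
        self.store = {}                   # path -> (expiry, body)
        self.origin_hits = 0

    def get(self, path: str, now: float) -> bytes:
        entry = self.store.get(path)
        if entry and entry[0] > now:
            return entry[1]               # cache hit: no origin traffic
        body = self.fetch_origin(path)    # cache miss: one origin fetch
        self.origin_hits += 1
        self.store[path] = (now + self.ttl, body)
        return body

cache = SegmentCache(lambda path: b"manifest-body")  # stand-in origin fetch
```

CDNs implement this (plus request collapsing and origin shielding) for you; the sketch only shows why a thousand concurrent viewers need not mean a thousand origin requests.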