
Media over QUIC: Can Streaming Finally Have Both Scale and Low Latency?

Zac Shenker

Global Director of Domain Strategy - Media, Entertainment, Gaming & AdTech

John Agger

Principal Industry Marketing Manager, Media & Entertainment, Fastly

For years, streaming has been forced into a compromise. You either leaned into the lowest possible latency with technologies such as WebRTC, or you prioritized scale and operational simplicity with HLS and DASH. Having both wasn’t really an option.

Media over QUIC (MoQ) aims to break that tradeoff. This time, it’s not just theory or lab work - there’s real momentum behind it.

Traditional Streaming Protocols: HLS, DASH, and Their Latency Tradeoffs

At its core, Media over QUIC (MoQ) seeks to modernize the way online media is delivered. It’s built on top of QUIC - the same transport protocol that underpins HTTP/3. Fastly has supported both QUIC and HTTP/3 for years and was instrumental in developing both standards and bringing them to internet scale.

Traditional streaming protocols are fundamentally rooted in file delivery. Video is split into segments - typically around 4 seconds, though low-latency variants push them as short as 0.5 seconds - that players request over HTTP, continuously fetching the next segment during playback. Clients also repeatedly re-request manifests to locate upcoming segments, which significantly increases overall request volume. This model scales, but at the cost of delay: viewers are always waiting on the next segment, and longer segments mean higher latency.
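To see why segment duration dominates latency, here is a rough back-of-the-envelope sketch. The numbers and the simple formula are illustrative assumptions, not measurements - real players vary widely in how much they buffer:

```python
# Rough sketch: why segment-based delivery implies multi-second latency.
# The formula and values are illustrative assumptions, not measurements.

def segmented_latency(segment_s: float, buffered_segments: int,
                      encode_package_s: float = 1.0) -> float:
    """Approximate glass-to-glass latency for HLS/DASH-style delivery:
    time to encode and package a segment, plus the player's buffer of
    whole segments held before playback starts."""
    return encode_package_s + segment_s * buffered_segments

# Classic defaults: 4 s segments, ~3 segments buffered.
print(segmented_latency(4.0, 3))   # ~13 s behind live
# Aggressive low-latency tuning: 0.5 s segments, 2 buffered.
print(segmented_latency(0.5, 2))   # ~2 s behind live
```

Even with aggressively short segments, the player is still waiting on complete, packaged files - which is the constraint MoQ removes.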

The industry sought to solve this with technologies such as WebRTC, which achieves ultra-low latency by maintaining persistent, real-time connections. However, WebRTC is complex to operate and arguably doesn’t scale easily, and egress costs from public cloud providers add significantly to the cost of running such deployments at scale.

How Media over QUIC (MoQ) Works

Instead of thinking in “files” or “segments,” MoQ treats media as a continuous stream of frames that can be published and subscribed to in real time. Because it runs over QUIC, MoQ also brings a few practical advantages for live delivery:

  • Low latency by default - no need to wait on segment boundaries before sending data.

  • Tunable experiences at the player, client, and stream level - whether that’s jumping ahead to stay at the live edge or resuming playback where the viewer left off after a rebuffer.

  • Better multiplexing - video, audio, and metadata move independently without blocking each other.

  • Application-aware handling of packet loss and congestion - unlike TCP - delivers stronger performance over real-world networks and enables better prioritization of what matters most to each user.

  • Server- or client-side adaptive bitrate switching gives publishers finer control over how bandwidth and capacity are allocated, and how the viewing experience is tailored for each end user.

  • Pull-based subscriptions ensure that costly processes like encoding and packaging only run when streams are actually being watched. Bytes are delivered on demand, significantly reducing total cost of ownership for streams with few viewers or limited geographic reach.
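The adaptive bitrate point above can be sketched in a few lines. This is a hypothetical illustration - the rendition names, bitrates, and the `pick_rendition` helper are made up for the example, and real ABR logic also weighs buffer health, device capability, and history:

```python
# Hypothetical server-side rendition selection for MoQ-style delivery.
# Rendition names and bitrates (kbit/s) are illustrative assumptions.

RENDITIONS = {
    "video/1080p": 6000,
    "video/720p": 3000,
    "video/480p": 1200,
    "video/240p": 400,
}

def pick_rendition(estimated_kbps: float, headroom: float = 0.8) -> str:
    """Choose the highest-bitrate rendition that fits within a safety
    margin of the estimated bandwidth; fall back to the lowest."""
    budget = estimated_kbps * headroom
    fitting = [(rate, name) for name, rate in RENDITIONS.items() if rate <= budget]
    if not fitting:
        return min(RENDITIONS, key=RENDITIONS.get)
    return max(fitting)[1]

print(pick_rendition(5000))  # video/720p - 3000 fits the 4000 budget
print(pick_rendition(300))   # video/240p - fallback to the lowest
```

Under MoQ, this decision can live on either side of the connection: the subscriber can switch tracks itself, or the publisher/relay can steer which track a given subscriber receives.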

Instead of pushing segments to an origin and waiting for players to pull them down, online broadcasters publish a live stream once and let a distributed edge layer handle the rest. In practical terms, this equals lower glass-to-glass latency without having to bolt on a separate, specialized delivery stack, along with more stable playback when streaming conditions are less than ideal. It also opens the door to offering alternate feeds, camera angles, or more tailored viewing experiences without needing to spin up parallel infrastructure.
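The publish-once, fan-out-at-the-edge model can be sketched with an in-memory toy. This is purely conceptual - a real MoQ relay speaks MoQ Transport over QUIC with priorities and congestion handling, and the `ToyRelay` class and track names here are invented for illustration:

```python
from collections import defaultdict
from typing import Callable

class ToyRelay:
    """In-memory stand-in for a MoQ-style edge relay (illustrative only).
    Frames published to a named track are fanned out to subscribers the
    moment they arrive - no segment boundaries, no manifest polling."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[bytes], None]]] = defaultdict(list)

    def subscribe(self, track: str, on_frame: Callable[[bytes], None]) -> None:
        self._subscribers[track].append(on_frame)

    def publish(self, track: str, frame: bytes) -> None:
        # Immediate fan-out; a real relay would apply prioritization and
        # congestion-aware dropping here.
        for on_frame in self._subscribers[track]:
            on_frame(frame)

relay = ToyRelay()
received: list[bytes] = []
relay.subscribe("camera/main/video", received.append)
relay.publish("camera/main/video", b"frame-0")
relay.publish("camera/main/video", b"frame-1")
print(received)  # [b'frame-0', b'frame-1']
```

The broadcaster publishes each frame once; every subscriber on the track receives it as it is produced, which is what keeps glass-to-glass latency low without per-viewer infrastructure.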

Why Media over QUIC Matters for Live Streaming at Scale

The appeal of MoQ is straightforward: it removes a long-standing constraint in streaming architecture rather than continuing to work around it. For years, the industry has chipped away at latency with smaller segments, chunked transfer, and other fixes - but the underlying model hasn’t changed. 

MoQ takes a different approach by addressing the problem at the transport level, enabling media to move as it’s produced rather than after it’s packaged. The result is a path to consistently low latency without inheriting the operational complexity that typically comes with real-time systems. For broadcasters and platforms, that means fewer tradeoffs - no more choosing between reach and responsiveness, or between simplicity and performance.

Just as important, MoQ aligns with where the web and delivery infrastructure are already headed. Instead of layering specialized systems on top of existing workflows, MoQ offers a more unified foundation for delivering live media at scale. If it delivers on its promise, it won’t just improve latency - it will simplify how streaming systems are built, and expand what kinds of real-time experiences can be delivered reliably over the open internet. The technology will be critical in enabling the next level of scale for live online streaming events.

MoQ Challenges: Standards, Tooling, and Device Support

MoQ isn’t a drop-in replacement for HLS or DASH. The tooling is still early, the standards are still settling, and the operational playbook is still being defined. But after years of trading off latency for scale - or the other way around - this is one of the first approaches that looks like it could realistically deliver both.

There are still several challenges to address. For MoQ, these include device support, media formats (MSF, LOC), security and encryption, as well as authentication, metrics, and logging.

It’s worth mentioning the Internet Engineering Task Force (IETF) working group focused on MoQ Transport. The goal of this group is to define a new transport approach - built on QUIC - that replaces pull-based behavior with a more responsive publish/subscribe system capable of delivering media as it’s produced, not after it’s packaged into segments. The aim is to close the gap between the scale of traditional streaming and the responsiveness of real-time systems like WebRTC, laying the foundation for a unified, low-latency media architecture.

The Future of Streaming: A Look Ahead with MoQ

For an industry that has spent years balancing trade-offs, this is one of the first credible attempts to move beyond all-too-familiar constraints altogether. Standards still need to mature, ecosystems need to catch up, and real-world deployments will ultimately prove what’s viable - but the foundation being laid is materially different from what came before.

For Fastly, MoQ reflects a broader shift toward real-time, internet-native media delivery built on modern transport technologies. We’re closely following its development and expect to play an active role as the ecosystem evolves. We believe MoQ has the potential to significantly improve streaming - and redefine what’s possible at scale.

Fastly will be at the NAB Show next week, where we’re ready to dive into how our support for Media over QUIC can improve your streaming performance. Reach out to your account team to set up a meeting, or stop by and find us on the show floor in the cabana area. In the meantime, you can explore resources like moq.dev to learn more about how Media over QUIC is shaping the future of online broadcasting.
