How Fastly protects its customers from massive DDoS threats, including the novel Rapid Reset attack

Frederik Deweerdt

Engineering - Edge Systems, Fastly

Marcus Barczak

Senior Principal Engineer, Fastly

Wayne Thayer

Senior Director of Engineering, Fastly

Hossein Lotfi

VP of Engineering, Network, Platform, Edge Systems, Fastly

October 25, 2023


Customer traffic on Fastly is not vulnerable to the massive Rapid Reset DDoS attacks that have been recently disclosed.

At the onset of the Rapid Reset DDoS activity, Fastly saw high volumes of requests that risked high CPU utilization if left unaddressed, but our autonomous systems helped detect the method used by the attackers and we quickly deployed a proper mitigation.

Our protections for massive scale attacks are handled at the edge automatically with detection and defense capabilities that are built into our kernel and network application layer processing stack. These systems defend all Fastly customers from massive attacks because we prioritize keeping the Fastly network up and running for everyone regardless of the package or security offerings they subscribe to. We offer various other security products that allow customers to define more specific security rules for their unique needs, like our Next-Gen WAF, Edge Rate Limiting, and Managed Security Services (MSS), but these massive scale attacks are mitigated for all Fastly customers and traffic at no extra cost. This and other secure-by-design capabilities are built directly into the Fastly network.

Details of the Rapid Reset attack have been extensively discussed in a number of blog posts by our industry peers, but in this write-up we want to explain how we are able to detect novel attacks and mitigate them quickly and effectively for Fastly customers. We approach DDoS attacks on our network differently in order to deliver incredible performance and reliability for our customers and their end users all over the internet, all of the time. Our goal is to make it impossible for them to tell the difference between times when we’re under attack from a record-breaking botnet that causes significant deterioration in other vendors, and times when we’re not.

Timeline of events

Late August 2023 – A novel Rapid Reset issue was first automatically detected at a volume of ~250 million RPS and a duration of ~3 minutes
Our DDoS forensics systems immediately flagged a never-seen-before class of attack and they helped identify the exact method used by the attacker to reach this level of amplification.
Fastly engineers easily added new capabilities to our DDoS mitigation engine at the edge in order to effectively mitigate future occurrences of this attack.
October 10, 2023 – CVE-2023-44487 (the Rapid Reset attack that Google and others have written about) was disclosed. This did not present a risk to our customers as it was not novel to Fastly and effective mitigations were already in place. Fastly was not impacted.

Let’s look at Fastly’s principles for dealing with massive DDoS attacks, and some of the specific capabilities that played a role in our Rapid Reset defense.

Fastly’s DDoS defense principles

We have three core principles for how we structure our network defenses against DDoS attacks.

Everything starts with rapid, accurate detection of the malicious traffic
Mitigations must be safe to run. Even one false positive is way too many
Our defense tactics should be deceptive, minimizing signals that go back to the attackers

Anatomy and economy of DDoS attacks

There are three basic flavors of DDoS attack: PPS (Packets Per Second), Volumetric, and RPS (Requests Per Second).

Packets Per Second attacks attempt to overwhelm the packet processing engines along the path (L3 and L4 network layer attacks). They test the performance boundaries of the packet processing engines. They’re cheaper to run because they cost the attacker less network bandwidth. By contrast, a “Volumetric” attack tries to overwhelm the transfer capacity of a network by clogging it with data and sending lots of large packets. These have become less common recently, but are still seen sometimes.

RPS attacks poke around your site or application trying to identify computationally expensive objects, and then send an overwhelming number of requests for them. Finding a way to request something that triggers a complex regex will cause CPU spikes and tax your infrastructure many times more than requesting a static image that’s served from a CDN.

All DDoS attackers are looking for amplification factors, meaning anything that lets them use the least amount of botnet resources to still do massive amounts of damage. Attackers have to pay to run their botnets in the same way legitimate operators have to pay for their computing resources, so it costs them real money to have a bunch of botnet nodes active. They will always be interested in ways to have fewer nodes achieve a larger impact. For example, if each node can send 1 request per second, then you need a million nodes to send 1 million requests per second. However, if you can find a way to have each node send many times more requests, then the cost is lower for the same volume. Rapid Reset found a way to have a relatively small number of connections and botnet nodes sending hundreds of millions of requests per second in their attacks.
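To make the economics concrete, here is a minimal sketch of the arithmetic in the example above. The function name and the per-node rate of 1,000 requests per second are illustrative assumptions, not measurements of any real botnet.

```python
def nodes_needed(target_rps: int, rps_per_node: int) -> int:
    """Smallest number of nodes whose combined rate reaches target_rps."""
    if rps_per_node <= 0:
        raise ValueError("rps_per_node must be positive")
    # Integer ceiling division, so a partial node counts as a whole node.
    return (target_rps + rps_per_node - 1) // rps_per_node


# One request per second per node: a million nodes for 1 million RPS.
baseline_nodes = nodes_needed(1_000_000, 1)

# Hypothetical amplification: each node now sends 1,000 requests per second.
amplified_nodes = nodes_needed(1_000_000, 1_000)

print(baseline_nodes, amplified_nodes)
```

The attack volume is the same in both cases; only the number of nodes the attacker has to pay for changes.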

Novel or not?

Most attacks can be easily grouped into being either well understood or novel. A well understood attack is usually something that has been seen before, often many times, and slowly mutates over time as attackers try to find better ways to circumvent the defenses that are in place, or look for a new weak spot in the defenses where the attack can still be deployed. The OWASP Top Ten are a great example of this – everyone knows basically what they are, but attackers are constantly tweaking and evolving them, and poking your defenses with new approaches, so you have to stay on top of it.

When an attack is novel it means there’s a fundamentally new approach that hasn’t been seen before. These can pack a big punch and do a lot of damage because the fix often has to be novel as well, and it might take longer to create and deploy.

The Rapid Reset attack disclosed on October 10th was considered novel because it relied on a characteristic of the HTTP/2 protocol that had not been previously exploited.

Rapid detection with accurate fingerprinting

Rapid detection is at the center of an effective response strategy. DDoS attacks accelerate quickly and are often over quickly, so effective defenses must be able to accurately detect an attack and distinguish between the good and the bad traffic in real-time.

Attacks often scale from zero requests per second (RPS) to millions or hundreds of millions of RPS within just a few seconds, and then may be over less than a minute later. As you can see in the chart above, when we look at the attacks* we saw at Fastly from July 1, 2023 through October 12, 2023:

90% of the attacks have a total duration of 150 seconds or less.
50% of the attacks are under 52 seconds!

By the time a human can be made aware of an attack and equipped to respond, the attack is often over. Applying the rules at that time would be like getting a vaccine the week after you’ve already recovered from being sick. It will probably help you the next time you get exposed, but it doesn’t fix anything related to the impact of your first illness. Fastly’s automated detection and response is able to detect and respond without human intervention to mitigate attacks.

Sophisticated attacks like Rapid Reset and others require the discovery of distinguishing characteristics to identify the bad traffic. Anything less sweeps up a lot of the organic traffic the attack is blended in with as false positives, and this results in the negative consequences we’re all familiar with – customer websites and applications experiencing problems and parts of the internet breaking.

Attribute Unmasking for accurate, automated signature extraction

Fingerprinting is a way to identify specific attacks and distinguish them from the organic traffic on a network. In its simplest form you can imagine that you have the entirety of an attack initiated from a single IP address. Normally, you don’t see any traffic at all from that IP address on your network, but during the attack you’re getting a TON. Your fingerprinting can start by simply identifying that IP address and blocking it, and congratulations! You’ve stopped the attack. The problem is that over time the attackers will get better at more advanced techniques like blending their traffic in with other legitimate traffic so that it’s harder to identify and separate. This also means that it costs more for your defense team to mitigate it (time, resources, computation, business impact, etc.). A couple of years ago the Meris botnet did this by taking over infrastructure in a lot of campus networks like hospitals, universities, and other institutions, and routing malicious traffic through them to make the attack look like it was coming from these organizations.

This kind of blending with legitimate traffic complicates things because even if you know the IP address it’s all coming from, it would be catastrophic to block that traffic in such an unspecified way because you would also be blocking a ton of legitimate traffic. That would be very bad anytime, but especially in the case of hospitals, ISPs, and other organizations where a blockage would lead to serious real-world impacts!

To address this problem, Fastly uses a technique we call “Attribute Unmasking” to rapidly extract accurate fingerprints out of the network traffic when we are being hit with complicated attacks. For any request coming through a network there are a huge number of characteristics that can be used to describe the traffic: Layer 3 and Layer 4 headers, TLS info, Layer 7 details, and more. Borrowing concepts from AI, our Attribute Unmasking system ingests the metadata from inbound requests on our network and extracts the attributes whose traffic over time matches the shape and volume of the attack.

The system starts by testing individual attributes until it finds one that shows some similarity to the curve of the attack on the network.

Now the system has a candidate to work with, and it starts combining that first attribute with others, testing out sets of attributes, and building a curve that gets closer and closer to representing the entirety of surplus traffic on the network produced by the attack. With each incrementally better attribute set that is identified the system shrinks the degrees of freedom needed to further improve the model until it fails to be able to produce a better fit, and has arrived at an optimized fingerprint for the attack.
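The search described above can be illustrated with a greedy forward selection. This is only a sketch under stated assumptions, not Fastly's implementation: requests are modeled as (time bucket, attribute dictionary) pairs, an expected baseline count per bucket is assumed to be known, and the fit is scored with a plain sum of squared errors. All function and attribute names are hypothetical.

```python
def curve(requests, fingerprint, n_buckets):
    """Count, per time bucket, the requests matching every (attribute, value) pair."""
    counts = [0] * n_buckets
    for bucket, attrs in requests:
        if all(attrs.get(key) == value for key, value in fingerprint):
            counts[bucket] += 1
    return counts


def fit_error(candidate, surplus):
    """Sum of squared differences between a candidate curve and the surplus traffic."""
    return sum((c - s) ** 2 for c, s in zip(candidate, surplus))


def extract_fingerprint(requests, baseline):
    """Greedily add attribute pairs until the fit to the surplus stops improving."""
    n_buckets = len(baseline)
    observed = curve(requests, frozenset(), n_buckets)
    # Surplus traffic is whatever exceeds the assumed baseline in each bucket.
    surplus = [max(o - b, 0) for o, b in zip(observed, baseline)]
    candidates = {(k, v) for _, attrs in requests for k, v in attrs.items()}

    fingerprint = frozenset()
    best = fit_error(observed, surplus)
    while True:
        scored = [
            (fit_error(curve(requests, fingerprint | {pair}, n_buckets), surplus), pair)
            for pair in candidates - fingerprint
        ]
        if not scored:
            break
        err, pair = min(scored, key=lambda item: item[0])
        if err >= best:  # no candidate produces a better fit, so stop
            break
        fingerprint, best = fingerprint | {pair}, err
    return fingerprint


# Synthetic data: three organic requests per bucket, plus an attack in buckets 1 and 2.
organic = [
    {"ua": "browser", "path": "/"},
    {"ua": "curl", "path": "/"},
    {"ua": "browser", "path": "/search"},
]
requests = [(b, dict(attrs)) for b in range(4) for attrs in organic]
requests += [(b, {"ua": "curl", "path": "/search"}) for b in (1, 2) for _ in range(10)]

fingerprint = extract_fingerprint(requests, baseline=[3, 3, 3, 3])
print(sorted(fingerprint))
```

In this toy data neither attribute isolates the attack on its own, because legitimate clients also use that user agent and that path. Only the combination of the two matches the surplus curve exactly, which is why the search keeps combining attributes after finding a first candidate.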

This might sound like a computationally intensive process, but it’s all occurring in real-time – identification, fingerprinting, and mitigation.

Our Attribute Unmasking system is an area of continued investment for us. It’s already a stunning achievement. We’re extremely proud of what we’ve accomplished and the way in which it is already protecting us from attacks that impact other networks, but we will continue to improve it and expand its capabilities.

Rapid fingerprinting as a differentiator

Fastly works under a principle that we should do as much processing and decision-making as possible on the edge rather than running things through a centralizing function that will inevitably serve as a bottleneck. We can do that because our network is completely software defined. By removing dependencies on specialized hardware such as routers and other components, all of these functions can be run in a more distributed fashion across the servers in parallel. Fastly prioritizes speed, and in order to do this kind of processing without impacting the experience for our customers and their end users, we need to perform it at the edge. Our distributed processing and decision making gives us the power and flexibility to process, analyze, diagnose, and respond with effective solutions at the edge.

Rapid fingerprinting wouldn’t be valuable if we couldn’t quickly adapt to new attacks and immediately implement our mitigations. Our system is modular – this means that we can rapidly enhance our detection and mitigation capabilities as new classes of attacks are discovered without needing to develop an entirely new mechanism to respond. When something like the Rapid Reset attack comes along, we simply add a few new functions to our detection and response modules, which keeps our response times incredibly short, even for novel attacks.

Fastly customers experience a direct benefit from this innovation, and the whole point is that they never know when it’s happening because it works! This type of automation is extremely difficult to create. It requires a truly distributed architecture like only Fastly runs, and a pool of talent that isn’t easy to assemble, but it’s worth it when it makes a noticeable difference in the quality of service we are able to provide.

Safe mitigations that ensure low false positives

Every automated system has a risk of generating false positives and blocking legitimate traffic. The industry has a long history of outages where an automated system alone, or in combination with human error, created a bad situation, but if you’re too relaxed then you start to let actual attacks through. In order to walk that fine line we apply two categories of security rules on the network. Some are part of a basic set of rules, and these are always active. They’ve been through lots of validation and normal code review processes, and we consider them safe to run all the time without consequence. Our Attribute Unmasking rules are extremely effective, but their constantly changing nature introduces a higher risk of generating false positives. We match these rules up against a “distress signal” so that they are only applied while the network is being attacked, because these are the times when they help with mitigation. This limits the impact of those rules and keeps them from catching false positives when attacks are not occurring.
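The two rule categories can be sketched as follows. This is a simplified illustration, not the actual rule engine: the Rule type, the under_attack flag standing in for the distress signal, and the example rules themselves are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, str]


@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    always_on: bool  # True for the vetted base set, False for generated rules


def should_block(request: Request, rules: List[Rule], under_attack: bool) -> bool:
    """Base rules always apply; generated rules apply only during an attack."""
    for rule in rules:
        if not rule.always_on and not under_attack:
            continue  # no distress signal, so skip the riskier generated rule
        if rule.matches(request):
            return True
    return False


rules = [
    # Hypothetical always-on rule: reject requests with no Host value.
    Rule("missing-host", lambda r: r.get("host", "") == "", always_on=True),
    # Hypothetical generated fingerprint, gated on the distress signal.
    Rule(
        "fingerprint-1",
        lambda r: r.get("ua") == "curl" and r.get("path") == "/search",
        always_on=False,
    ),
]

request = {"host": "example.com", "ua": "curl", "path": "/search"}
print(should_block(request, rules, under_attack=False))
print(should_block(request, rules, under_attack=True))
```

The same request is allowed in quiet periods and blocked only while the distress signal is raised, so a fingerprint that is slightly too broad cannot affect legitimate traffic outside of an attack.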

Deception as an attack defense strategy

Information is power when it comes to DDoS attacks. When attackers learn something about a network or about their previous attempts, they use it to plan their next attack. It’s a cat and mouse game of constant evolution, and by withholding information from the attackers you make them work harder to figure out whether they need to change tactics, and how they should adapt. When most platforms detect an attack they act swiftly to close the connection on the attacker, or deny access to their platform in another way. This signals to the attacker that they’ve been discovered, and also that if they try the same approach again it is likely to be more easily identified and blocked.

At Fastly, we intentionally minimize the amount of information (of any form) that is sent back to the attackers. One example is that we may leave the connections open or use other tactics that make the attacker think they have not been detected, and that the attack is going as planned. When Alan Turing and the team at Bletchley Park worked to break the Enigma ciphering in World War II, they knew it was important to not let on that the code had been broken, because the enemy would adapt more quickly and eat away at their advantage.

A recent example of Attribute Unmasking

Here is an interesting example of an attack that was automatically detected by Attribute Unmasking. The system detected an increase in the volume of traffic on our network, and within seconds it compiled a signature that effectively matched the curve of the attack. When we reviewed the details of the attack the next day, we looked at the headers of the attack requests and the User-Agent was quite peculiar – it looked like this!

User-Agent: 🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡

*Attack duration data was collected by looking at the ingress requests to the Fastly network from 2023-07-01 to 2023-10-12. The onset of an attack is registered when a 30% increase from the anticipated baseline is detected, and the attack ends when traffic is back to expected levels. We have excluded known organic traffic spikes and load testing from this dataset.
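The onset and end criteria in this footnote could be applied to a traffic series roughly as sketched below. This assumes one sample per second, a known baseline value for each sample, and that "back to expected levels" means at or below that baseline; those details, the function name, and the sample numbers are assumptions for illustration only.

```python
def attack_windows(observed, baseline, threshold=0.30):
    """Return (start, end) index pairs; the duration of each attack is end - start."""
    windows = []
    start = None
    for t, (obs, base) in enumerate(zip(observed, baseline)):
        if start is None and obs > base * (1 + threshold):
            start = t  # onset: more than 30% above the anticipated baseline
        elif start is not None and obs <= base:
            windows.append((start, t))  # end: traffic is back at the baseline
            start = None
    if start is not None:
        # Attack still in progress when the series ends.
        windows.append((start, len(observed)))
    return windows


# Synthetic per-second request counts against a flat baseline of 100.
observed = [100, 100, 140, 500, 500, 120, 100, 100]
baseline = [100] * len(observed)

windows = attack_windows(observed, baseline)
durations = [end - start for start, end in windows]
print(windows, durations)
```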