Khan Academy delivers free education to millions globally with Fastly's edge cloud platform
The challenge
Khan Academy's mission is to provide free, world-class education to anyone, anywhere. For Miguel Castillo, Staff Software Engineer on the infrastructure platform team, that mission carries personal weight. "I grew up in a low income family. This work resonates with me," Castillo said. His team manages the services and infrastructure that deliver Khan Academy's content to users worldwide, many of whom access the platform from areas with unreliable internet connections or only through mobile devices.
As Khan Academy expands its district offerings globally, including in countries with less reliable networks, the infrastructure team faces mounting challenges to scale. The challenge isn't just about handling volume. Khan Academy serves users in regions where internet access is spotty, meaning every millisecond of latency matters. With users all over the world, providing a consistent and reliable experience is essential.
The solution
Khan Academy's relationship with Fastly began over a decade ago through the Fast Forward program, which provides free services and support to open source projects and the nonprofits that support them. What started as a straightforward CDN implementation has evolved into a critical partnership. "Fastly has become an extremely important part of our ecosystem," Castillo said.
Pushing the Limits with VCL
The team, eager to push the boundaries of what was possible as part of the Fast Forward program, leveraged Varnish Configuration Language (VCL) to manage complex authentication, fine-grained caching control, and routing logic. However, the VCL code and infrastructure grew more complex as new requirements were implemented, and over time fewer engineers understood the mechanics of VCL. The request flow for reauthenticating users revealed the complexity of the system: a request entering the Fastly network would be forwarded to Google App Engine for reauthentication and then loop back to the Fastly network to complete the user's request. This convoluted setup not only strained the system but also increased potential points of failure, representing a clear opportunity for improvement.
"That was a big loop of requests that needed to happen," Castillo explained.
Simplifying architecture with Fastly Compute
The migration to Fastly Compute transformed Khan Academy's approach. "All of our stack at Khan Academy is built in Go," Castillo said. "We were able to leverage a lot of the code and infrastructure that we had already written for our GCP services." By migrating authentication code and encryption logic from VCL to Fastly Compute, the team streamlined request flows, reducing redundant network hops and simplifying complex workflows, such as user reauthentication. This shift increased code reuse for tasks like unit testing infrastructure, making the system easier to maintain and more efficient. The new setup allowed them to run unit and integration tests more effectively, ensuring changes could be validated in controlled environments before deployment, reducing the risk of errors in production.
The results were immediate. By eliminating multiple hops between services, Fastly Compute streamlined request flows, allowing it to interact directly with App Engine and continue the workflow. "That reduces the brittleness because you are reducing the number of hops that a request goes through. Fewer hops reduce the risk of failures and make the system more reliable," Castillo explained.
For users in remote parts of the world, streamlining the request flows resulted in a more efficient and consistent user experience. "Previously, the browser had to make two separate calls, one to load data and another to load the HTML, resulting in two round trips," Castillo explained. "With Fastly, we consolidated this into a single call. By prefetching user data from the Chicago point of presence (POP), we eliminated unnecessary delays and made the experience more seamless." This reduction in load time improved reliability and availability, ensuring students could access educational content more seamlessly, even in areas with challenging internet conditions. "It's all about making the platform as easy to use as possible for every learner, no matter where they are," Castillo added.
What impressed the team most was how well performance held up when introducing Fastly Compute as a new hop in their architecture. "For all intents and purposes, our user experience didn't change," Castillo said. "Latency remained stable following the introduction of Fastly Compute into the service chain."
Blocking malicious traffic with bot detection
When Khan Academy rolled out Fastly's Bot Management, the security benefits became immediately apparent. "We had millions of bot requests a day," Castillo said. "They were literally trying to brute force logging into our system." The bot traffic wasn't just a security concern; it was putting stress on Khan Academy's backends and driving up costs in Google Cloud Platform.
In response to unusually high numbers of malicious requests by bad actors, Khan Academy enabled the Proof of Work low-touch integration into various endpoints that were particularly hard hit. In addition to immediate gains in protection and performance, this provided very rich data on which to plan a more comprehensive roadmap for Fastly integrations. For the security team, it’s a matter of having the right tools for the job and being able to rely on those tools. This work provided the confidence to deepen the integration between Khan Academy and Fastly’s security and intelligence services.
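For readers unfamiliar with the general idea behind a proof-of-work challenge, the sketch below shows a generic hashcash-style scheme: the client must find a nonce whose SHA-256 hash starts with a required number of zero bits, which is cheap for the server to verify and costly to repeat millions of times. This is an illustration of the concept only. It is not Fastly's implementation, and the challenge format, difficulty value, and function names are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
	"strconv"
)

// leadingZeroBits counts the zero bits at the start of a SHA-256 digest.
func leadingZeroBits(sum [32]byte) int {
	n := 0
	for _, b := range sum {
		if b == 0 {
			n += 8
			continue
		}
		n += bits.LeadingZeros8(b)
		break
	}
	return n
}

// valid reports whether nonce solves the challenge at the given difficulty.
// Verification costs the server a single hash.
func valid(challenge string, nonce uint64, difficulty int) bool {
	sum := sha256.Sum256([]byte(challenge + ":" + strconv.FormatUint(nonce, 10)))
	return leadingZeroBits(sum) >= difficulty
}

// solve brute-forces a nonce; on average this takes about 2^difficulty hashes,
// which is the cost imposed on each client request.
func solve(challenge string, difficulty int) uint64 {
	for nonce := uint64(0); ; nonce++ {
		if valid(challenge, nonce, difficulty) {
			return nonce
		}
	}
}

func main() {
	const difficulty = 16 // hypothetical setting for the example
	nonce := solve("example-challenge", difficulty)
	fmt.Println("nonce:", nonce, "valid:", valid("example-challenge", nonce, difficulty))
}
```

The asymmetry is the point: a legitimate browser pays the cost once, while a bot attempting millions of logins pays it on every request.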
When combining the Bot Management capabilities with Fastly’s Next-Gen WAF, the team gained additional protection against abuse. Bad actors were creating multiple accounts using email aliases and sending spam to legitimate users. "We've been able to mitigate a large fraction of the bad actors with the help of the Next-Gen WAF and are working with the new insights provided by Fastly to keep increasing the fraction of true-positives," Castillo said. The security team also uses these tools to identify SQL injection attempts and other attacks.
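To make the email-alias abuse pattern concrete, the sketch below groups sign-up addresses by a canonical form so that many aliases of one mailbox show up as a single identity. It assumes the common plus-addressing convention (user+tag@domain), which not every mail provider follows, and the function names are hypothetical. The source does not describe how the Next-Gen WAF or Khan Academy detects these accounts, so this is only an illustration of the pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalEmail lowercases the address and strips a "+tag" suffix from the
// local part. It returns false for strings that are not of the form a@b.
// Assumption: the provider treats user+tag@domain as an alias of user@domain.
func canonicalEmail(addr string) (string, bool) {
	addr = strings.ToLower(strings.TrimSpace(addr))
	at := strings.LastIndex(addr, "@")
	if at <= 0 || at == len(addr)-1 {
		return "", false
	}
	local, domain := addr[:at], addr[at+1:]
	if plus := strings.Index(local, "+"); plus > 0 {
		local = local[:plus]
	}
	return local + "@" + domain, true
}

// countByCanonical tallies sign-ups per canonical address; a high count
// suggests one person registering many aliased accounts.
func countByCanonical(addrs []string) map[string]int {
	counts := make(map[string]int)
	for _, a := range addrs {
		if c, ok := canonicalEmail(a); ok {
			counts[c]++
		}
	}
	return counts
}

func main() {
	signups := []string{
		"spammer+1@example.com",
		"Spammer+2@example.com",
		"spammer+promo@example.com",
		"student@example.org",
	}
	fmt.Println(countByCanonical(signups))
}
```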
Automatically stopping attacks in their tracks with DDoS Protection
Another key capability Fastly provides to Khan Academy is DDoS Protection, which automatically detects and mitigates application DDoS attacks. Fastly's DDoS Protection leverages its Adaptive Threat Engine to identify malicious traffic and automatically take appropriate action to block it, ensuring threats are mitigated without any manual intervention. Although Khan Academy experiences massive ebbs and flows in traffic that can represent significant load, the solution accurately mitigates attacks behind the scenes without acting on legitimate traffic spikes.
"We've seen DDoS Protection activate multiple times, like during an attempted attack where traffic spiked by over 17,000%, and it automatically blocked the malicious surge before it ever reached our core services," said Castillo.
Scaling for global training events
Khan Academy’s infrastructure encountered a major increase in demand during training events in the Philippines, where hundreds of thousands of users came online simultaneously. Recognizing this as an opportunity to optimize performance, Khan Academy worked closely with Fastly support engineers to identify and address bottlenecks. Together, they made improvements that ensured the infrastructure could handle such spikes seamlessly in the future. When the next training event occurred, the platform scaled effortlessly. "We were able to handle a significant amount of traffic coming from the Philippines that we were just not used to," Castillo said.
The support relationship has been crucial. "The response and the help that we received from Fastly made all the difference," Castillo said. "Once that was clearly communicated, we have gotten way more than we could have expected in terms of help and support." For a team managing such a critical single point of failure, that support provides confidence.
Key takeaway
Khan Academy's infrastructure team now looks at Fastly as more than just a CDN. After successfully migrating to Fastly Compute, Castillo and team started evaluating other services that could move from Google Cloud Run to Fastly. "Moving more services into Fastly is definitely something we're considering," Castillo said. "You would only make that kind of assessment if you're confident in the platform."
Like all security teams, Khan Academy's has to find the highest-value tools available to keep its users, data, employees, and infrastructure safe. The Fastly infrastructure has proven invaluable in that regard, not only by providing Khan Academy with sorely needed Bot Management and DDoS Protection, but also by classifying and tagging 100% of its traffic with intelligence that lets the team make more informed decisions and take faster, smarter action.
For a nonprofit serving millions of students worldwide, the Fast Forward program delivers value beyond cost savings. "The time that is spent negotiating contracts, the time that is spent justifying purchases adds up," Castillo explained. "That is the sort of stuff that is operationally just really burdensome. With Fastly, we don't have to worry about that." Instead, the team can focus on what matters: making sure students everywhere can access free education, regardless of their location or resources.