Powering PyPI with Advanced Traffic Engineering

Joe Williams

Principal Engineer, Fastly

Stephen Strowes

Principal Engineer, Fastly

How Fastly is addressing performance through network addressing

The Python Software Foundation (PSF) is dedicated to advancing open-source technology as the steward of the Python programming language. A cornerstone of this mission is PyPI, the Python Package Index, which serves as the official third-party software repository for Python. And it's truly massive: the platform supports over 953,000 users, hosts close to 675,000 projects, and manages more than 7.3 million releases (source: pypi.org).

Given this scale, optimizing global traffic and sustaining peak performance is a significant challenge. But as a proud member of Fastly's open source program, Fast Forward, the PSF receives donated, full access to Fastly's technology, including advanced traffic engineering that uses Individual Provider Anycast to optimize PyPI performance.

Understanding Anycast: The Foundation of Global Performance

So, what exactly is anycast, and why is it so crucial for a service like PyPI? Imagine having the same IP address available everywhere, globally. That's the essence of anycast. Instead of a unique IP address pointing to a single server (like in unicast), anycast allows multiple servers in different geographic locations to share the same IP address.

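When a user requests content, Border Gateway Protocol (BGP) routing, the internet's "GPS," directs their request to the nearest available server advertising that shared IP. Here "nearest" means closest in routing terms, which usually, but not always, matches geographic distance. This provides several key advantages: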

  • Load Balancing: Traffic is automatically distributed across multiple locations, preventing any single server from becoming overwhelmed.

  • Performance: Users connect to the closest server, dramatically reducing latency and speeding up content delivery.

  • Availability: If one server goes down, traffic is simply routed to the next nearest operational server, ensuring continuous service.
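The routing and failover behavior described above can be sketched in a few lines of Python. This is a toy model, not how BGP or Fastly's network is implemented: the `Site` class, the site names, and the path lengths are invented for illustration, and the shared address comes from a range reserved for documentation. Real BGP best-path selection weighs many more attributes than path length.

```python
from dataclasses import dataclass

# Documentation address (RFC 5737); an assumption, not a real Fastly IP.
ANYCAST_IP = "192.0.2.10"


@dataclass
class Site:
    name: str
    as_path_len: int      # hypothetical path length seen from the client's network
    healthy: bool = True  # False models a site that withdrew its announcement


def route(sites):
    """Pick the healthy site with the shortest path (a simplified best-path rule)."""
    candidates = [s for s in sites if s.healthy]
    if not candidates:
        raise RuntimeError(f"no site is announcing {ANYCAST_IP}")
    return min(candidates, key=lambda s: s.as_path_len)


sites = [Site("frankfurt", 2), Site("london", 3), Site("new-york", 5)]
first_choice = route(sites).name

# The nearest site goes down; the same IP is still served by the next one.
sites[0].healthy = False
after_failure = route(sites).name
```

The client never changes the address it connects to. Only the set of sites announcing that address changes, which is why failover needs no action on the user's side.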

What’s the challenge then? Traditional anycast approaches do have limitations. They often treat the entire internet as a uniform network, which it is not. Different internet service providers (ISPs) and their networks can have varying levels of performance and connectivity, leading to sub-optimal routing for some users.

How anycast works: the same public IP address is announced from every location, wherever the servers are


Individual Provider Anycast: Fastly's Advanced Approach

Fastly recognized these limitations and developed a more sophisticated solution: Individual Provider Anycast (IPA). Think of it as finding the perfect middle ground between the "one-to-one" connection of unicast and the "one-to-nearest" simplicity of traditional anycast.

The internet isn't a single, monolithic network; it's a complex tapestry of interconnected providers. Often, these providers are quite geo-centric, meaning their networks are optimized within specific regions. Fastly's IPA leverages this reality by assigning and advertising provider-specific anycast IP addresses. Instead of just one shared IP for everyone, Fastly uses unique anycast IPs that are optimized for specific internet service providers.

When you try to access PyPI, the Domain Name System (DNS) plays a crucial role. It doesn't just resolve pypi.org to a single IP; it intelligently directs your client to the optimal Fastly edge cloud node and, more importantly, the optimal provider network for your specific location.
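A minimal sketch of that DNS-side decision follows, assuming the selection reduces to a lookup from the client's network to a best-ranked provider and then to that provider's anycast address. The tables, provider names, and addresses are all hypothetical (the addresses come from documentation ranges); the source does not describe how Fastly's DNS actually stores or evaluates this mapping.

```python
import ipaddress

# Hypothetical table: client network -> provider currently ranked best for it.
BEST_PROVIDER = {
    ipaddress.ip_network("198.51.100.0/24"): "provider-a",
    ipaddress.ip_network("203.0.113.0/24"): "provider-b",
}

# Hypothetical provider-specific anycast addresses (documentation range).
PROVIDER_ANYCAST_IP = {
    "provider-a": "192.0.2.1",
    "provider-b": "192.0.2.2",
}

# Fallback when nothing is known about the client's network.
DEFAULT_IP = "192.0.2.254"


def resolve(client_ip: str) -> str:
    """Return the anycast address a DNS answer would carry for this client."""
    addr = ipaddress.ip_address(client_ip)
    for network, provider in BEST_PROVIDER.items():
        if addr in network:
            return PROVIDER_ANYCAST_IP[provider]
    return DEFAULT_IP


answer_a = resolve("198.51.100.7")
answer_b = resolve("203.0.113.9")
answer_unknown = resolve("10.0.0.1")
```

Two clients asking for the same hostname can receive different addresses, each reachable through a different provider, while every address is still anycast across Fastly's locations.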

The Science Behind Provider Selection: Wins Above Nominal (WAN)

How does Fastly know which provider is "optimal" for you? We employ our "Wins Above Nominal" (WAN) methodology, in which we continuously measure the round-trip times (RTT) for every provider from every major network. This granular data allows us to rank provider performance separately for each network.
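As a rough illustration of ranking from measurements, the sketch below groups RTT samples by client network and provider, then orders the providers for each network by median RTT. The sample data, names, and the choice of the median as the ranking statistic are assumptions made for the example; the source does not specify the exact statistic Fastly uses.

```python
from collections import defaultdict
from statistics import median

# Made-up samples: (client_network, provider, rtt_ms)
samples = [
    ("net-1", "provider-a", 20.0), ("net-1", "provider-a", 24.0),
    ("net-1", "provider-a", 22.0), ("net-1", "provider-b", 35.0),
    ("net-1", "provider-b", 31.0), ("net-1", "provider-b", 33.0),
    ("net-2", "provider-a", 80.0), ("net-2", "provider-a", 90.0),
    ("net-2", "provider-b", 40.0), ("net-2", "provider-b", 50.0),
]


def rank_providers(samples):
    """Return {network: [providers ordered from lowest to highest median RTT]}."""
    rtts_by_pair = defaultdict(list)
    for network, provider, rtt in samples:
        rtts_by_pair[(network, provider)].append(rtt)

    scored = defaultdict(list)
    for (network, provider), rtts in rtts_by_pair.items():
        scored[network].append((median(rtts), provider))

    return {net: [p for _, p in sorted(pairs)] for net, pairs in scored.items()}


ranking = rank_providers(samples)
```

In this data the best provider differs by network, which is the situation a single shared anycast address cannot take advantage of.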

To see how we provide the best possible performance for the Python Software Foundation, it helps to look at how Individual Provider Anycast takes traditional anycast routing a step further.

The core idea is to significantly lower the median RTT for our users. We do this not by simply routing a user to the physically "closest" server, but by directing them to the connection that provides the best chance of a fast RTT. Fastly’s network is connected to thousands of different ISPs, and we use our detailed understanding of these connections to constantly analyze which provider is most likely to give a specific user an RTT below the median. We then bias the traffic distribution towards that provider, but we do so carefully, avoiding the simple "all-in" approach that could overwhelm a connection and make performance worse. This targeted, dynamic strategy means a larger percentage of users get a fast connection, shifting the overall RTT distribution to a lower, faster median.
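The two steps in that paragraph, estimating how likely each provider is to beat the median and then biasing traffic without going "all-in", can be sketched as follows. This is one possible reading, not Fastly's algorithm: a "win" is assumed to be a sample below the pooled median RTT, weights are assumed proportional to win rate, and the `max_share` cap of 0.8 is an invented parameter.

```python
from statistics import median


def win_rates(rtts_by_provider):
    """Fraction of each provider's samples that beat the pooled median RTT."""
    pooled = [r for rtts in rtts_by_provider.values() for r in rtts]
    nominal = median(pooled)
    return {
        provider: sum(r < nominal for r in rtts) / len(rtts)
        for provider, rtts in rtts_by_provider.items()
    }


def traffic_weights(rates, max_share=0.8):
    """Weights proportional to win rate, with the largest share capped."""
    total = sum(rates.values())
    if total == 0:
        # No provider ever wins: spread traffic evenly.
        return {p: 1 / len(rates) for p in rates}
    weights = {p: r / total for p, r in rates.items()}
    top = max(weights, key=weights.get)
    if len(weights) > 1 and weights[top] > max_share:
        # Hand the excess to the other providers instead of going all-in.
        excess = weights[top] - max_share
        weights[top] = max_share
        others = [p for p in weights if p != top]
        for p in others:
            weights[p] += excess / len(others)
    return weights


# Made-up RTT samples in milliseconds for one client network.
rtts = {
    "provider-a": [20, 22, 24, 26],
    "provider-b": [30, 32, 21, 34],
}
rates = win_rates(rtts)
weights = traffic_weights(rates)
```

With these samples the pooled median is 25 ms, so `provider-a` wins on three of its four samples and receives most of the traffic, while `provider-b` keeps a smaller share. The cap only takes effect when one provider would otherwise receive more than `max_share` of the traffic.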

How Fastly ranks network providers

Practical Implementation for PyPI

For PyPI, Fastly’s IPA is a game-changer. It lets Fastly apply this traffic engineering directly to PyPI's massive global traffic, so that Python users worldwide get fast, reliable access to packages.

We've seen tangible performance improvements in regions across the globe. Whether you're downloading a critical library from Europe, Asia, or the Americas, IPA steers your request onto the most efficient network path available. This automatic provider selection is critical for maintaining performance, especially during unforeseen internet disruptions, like submarine cable cuts or major network outages. PyPI just keeps working, seamlessly.

PyPI Performance gains with Fastly

Conclusion

The partnership between the Python Software Foundation and Fastly through the Fast Forward program is truly significant. It’s a testament to how collaboration and cutting-edge technology can empower open-source projects. These seemingly small, millisecond improvements in load times and reliability, when scaled across millions of users and billions of requests, translate into a massive real-world impact for the entire Python ecosystem.

As Ee Durbin, Director of Infrastructure at the PSF, puts it: "Fastly single handedly made it possible for our infrastructure to provide the quality of service it has for the past decade." We're incredibly proud of what we've achieved together and look forward to continuing to provide an exceptional experience for the global Python community.