From content to logic and beyond: The future of Fastly’s edge | Altitude NYC 2019

Fastly’s Chief Architect, Sean Leach explores both the current state and the untapped potential of logic at the edge — from paywall authorization and A/B testing to dynamic ad delivery.

(00:00):

I'm going to start with a quick story. The last couple of weeks have been extremely stressful just trying to get ready for the launch of the product that I'm going to be talking about here, and doing my slides, which, for those who know me well, is a very stressful event for me. The talk is fine, like I feel good here, but doing the slides is the worst. So this morning, it's like eight o'clock, I'm in the lobby. My head's thinking about a thousand different things, and the elevator doors part and out comes Hooman, and his hair is down and it looks like there's somebody behind him with a fan just blowing his hair up. But it was the wind from the elevator shaft. Do you remember the old '80s metal videos where the person's singing and their hair's flowing?

(00:51):

It was a sign that I needed that everything was going to go all right today and that it was going to be a good day. So having said all that, that's it. Thank you very... no. So I'm here to tell you a bit about our Compute@Edge product, and in kind of classic Fastly fashion, it's not just going to be me up here showing you a video or droning on about some of the tech. I'm definitely going to get to that because I know that's what a lot of you want to see. You want to see the code, you want to see some cool videos of how the product works. You want to learn about what it means to you. But I want to make sure we give you a good glimpse into the decisions that we made. Like, why did we do what we did?

(01:34):

We listened to a lot of what you've been telling us over the last several years. And so it kind of starts with me. One of my jobs is to go help where I can. And in this case, I wanted to help my friend Tyler realize some of the vision of our edge compute product. And I love this picture, and I try whenever I can to include this picture of Tyler in every talk that I do. I have another one of Hooman fishing that I also try to work into every presentation that I do as well. You won't see that one today though.

(02:03):

And what this was, was they had this vision of the technology, of what edge compute should be. So I jumped in earlier this year to work with an amazing engineering team. It's such a blessing for me, and it's very humbling for me to be the one up here showing you all this project, because there were so many amazing software engineers who worked on this. I can't write software; I can, like, program Excel and Google Slides, but that's about it. But I'm going to tell you a little bit about the story and some of the decisions that we made, and then give you kind of a preview of the code and the tech and whatnot.

(02:39):

So eight years ago, when Fastly launched, we really redefined what was possible at the edge. You know, CDN, it's pretty boring. A lot of static content. It was a very solved problem, but Artur talked about how the vision of the company early on was to enable everybody to do a lot more at the edge. The edge is such a great place to do computing, to run logic, and so when we launched the product early on we gave you a lot more power than you were used to. Thankfully you all appreciated it and did some amazing things with it, which I'll talk about in a second here, but this is the actual homepage from 2011.

(03:18):

Anybody here sign up in 2011, before you could actually get on the platform? Denenberg, of course. I did as well; this was before I joined Fastly. You could sign up, but you actually couldn't get on and use the product yet. So we provided you all (this was key, and we'll get back to this; it will be a theme that you'll notice) some core building blocks, some core technologies that you could take advantage of. We provided you full control, we provided you Instant Purge. I know that sounds like such a nerdy phrase in technology, but Instant Purge really changed the game when it came to the CDN.

(04:00):

We provided you real-time configuration changes, and then we provided you real-time visibility. And almost most importantly, we provided you code at the edge, and that's what I'm going to be talking a bit more about today: how do we take the code that we have allowed you all to write for the last eight years and take it to that next level? And you all took that technology and did far more than we thought possible with that tech.

(04:25):

We're constantly amazed by all the things that you all have done with that code, with that tech. You did A/B testing: you have a lot of content out at the edge, and you want to be able to test and see which piece of content does the best, whether it's based on a particular color of a button, particular words, where they are on the page, and how that affects your SEO ranking. But you want to do all of that out at the edge with cached content.

(04:51):

A/B testing allows you to do that. We allowed you to do A/B testing out at the edge. Paywalls: you all have a lot of content that you want to serve out at the edge and save those origin costs, right? Both the egress costs of the network and all those compute instances or data center servers. Your goal is to do a lot less at the core and more out at the edge, because it's much cheaper and it's faster.

(05:14):

Paywalls are another example of that. You have content that you want to serve out at the edge. You have an authentication system, and to make sure people have access to that content, you want to do all of that at the edge so you don't do the slow lookup back to the core. Geo-targeting: one of my favorite use cases I heard about at Fastly was a customer whose backends would fail every year on New Year's Eve because of the amount of traffic they would get. The problem was they couldn't cache enough content, because all of their lookups were done based on the lat/long of where the actual end user was on their phone.

(05:53):

Because of our purging capabilities and our caching capabilities, we allowed this customer to cache very specific geo areas for each individual user, and the next year at New Year's Eve, they had no problems whatsoever. The site stayed up; it was amazing. Such a great story. And then you all have probably done a ton of this, right? Advanced routing and whatnot. We've always provided that as a service. It's a great service. So you all did amazing things, but you wanted more. Of course you do. I don't blame you. I would want more as well.

(06:29):

You wanted technology and languages that you knew and loved, right? You hire all these engineers, you want to train them and let them use the tech that they're used to. You don't want to have to learn a new language, a specific language for a particular environment. You want to use the languages and tech that you know and love. You wanted a more expressive programming model, and I'll talk about this in a bit, but you wanted to be able to write edge apps the same way that you're writing web applications today: top to bottom, a main function that takes an HTTP request and returns an HTTP response, as opposed to some of the state machine stuff that you previously had to learn and kind of maintain in your head as you were coding.
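
As a rough sketch of that top-to-bottom model, a minimal handler can look like this in Rust. The attribute and types follow the shape of the public fastly Rust crate, so the exact names may differ from the beta API shown on stage:

```rust
// A minimal sketch of the "main takes a request, returns a response" model.
// Types and the #[fastly::main] attribute follow the public fastly Rust crate;
// the beta API demoed in this talk may have looked slightly different.
use fastly::{Error, Request, Response};

#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
    // Read something off the incoming request...
    let path = req.get_path().to_owned();

    // ...and return a response, top to bottom, with no state machine to hold in your head.
    Ok(Response::from_body(format!("Hello from the edge, you asked for {}", path)))
}
```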

(07:09):

And you wanted insane performance, of course; it's a great ask. And as Tyler talked about, you wanted it safe and secure; that's what you trust us for today. But was there more that we could do from the safety and security perspective? And you just basically wanted more power. So I'm going to date myself here, but of course, Tim "The Tool Man" Taylor. Great. I won't grunt, don't ask, but just pretend that he was grunting right now.

(07:40):

Application architectures are changing as well. You all are breaking up your monolithic applications, you're moving to microservices, and as part of that journey, you're starting to take advantage of (everybody knows this) serverless, right? Of course, there are servers. Classic joke: everybody's like, "Oh, there's no servers." There are servers, obviously. But you want us to deal with the servers; you don't want to deal with the servers.

(08:02):

And so serverless has become one of the fastest-growing technology trends that we've seen across the board with all of you, right? Take pieces of functionality, write business logic, and run it in a compute environment that you don't have to think about. It auto-scales, it auto-secures, et cetera. There are a lot of benefits to serverless, right? You don't have to worry about provisioning, you don't have to worry about scaling. You don't have to worry about security or patching. There are a lot of great benefits of serverless that you want to take more advantage of.

(08:34):

Our friends at CB Insights released this graph and the survey here showing that FaaS — function as a service — and serverless environments are the fastest-growing area of public cloud services, right? I don't have to tell you all this, but it's always good to show cool graphs reiterating what I'm saying. And it's growing even more than containers. Not only that: serverless, according to Google, will save us from robots, Google itself, and Western civilization.

(09:03):

So if nothing else helps, it will do that too. I'm pretty sure they actually say that for anything you type into Google, but it really helped to get a laugh out of everybody. So having said all that, having heard "You want more, you want to build more on us, you're using serverless, you want to start using more serverless," it was the perfect time for us to decide: let's build this Compute@Edge product. So that's what we did. So yeah, no, clap. It's actually pretty amazing.

(09:35):

It was a lot of work by a lot of really smart people, and I'm really excited to show you today. So let's talk about a few features. What this is, is an edge serverless environment, right? Serverless, edge: two great things. Let's put them together and give you access to them. It runs globally in every Fastly POP, which is key. When you talk about Lambda or some of the other serverless technologies, when you launch it, it runs in one data center. Ours runs everywhere: every single one of our data centers, on every single one of our machines.

(10:10):

It's incredibly fast, Tyler talked about this: 35 microsecond startup time. The competition that we've seen out there is five milliseconds to a hundred milliseconds. That is a pretty significant difference in performance, and it had to be, because we couldn't sacrifice performance for anything. We couldn't say, "We'll give you these great new languages but it'll be slow," or "it will be fast, but it won't be secure." We could not sacrifice any of those things. So it has to be incredibly safe as well, and it has to be incredibly powerful. Just picture that Tim "The Tool Man" Taylor graphic.

(10:44):

And it has to be one platform: we can't have different APIs or different UIs. It has to be one platform baked into what you all are used to, and it has to integrate well with all of your existing VCL and code that you've written today. So we spent a lot of time listening to you. We spent hours on the phone, over Zoom, you name it, hearing from you. We asked you a bunch of questions. What do you love about Fastly? What don't you love about Fastly? What kind of use cases do you wish you could do on Fastly?

(11:16):

We spent a lot of time on those and crunched the data. And then we spent a lot of time R&D...ing. That was my joke up there too. Feel free to laugh. There's tech that Tyler's talked about over the years. WebAssembly, right? A game changer for us, because when we talked about wanting to have multiple environments, we wanted to give you all the ability to write these apps in multiple languages.

(11:44):

WebAssembly was the first path to that. It generates intermediate code from all of those languages. So then what we had to do was build the runtime environment to let you run that code securely and with high performance. That's where Isolation comes in, an internal project that we had been working on that Artur talked about, which was all about taking that intermediate code, converting it to object code, and then running it in a safe environment.

(12:11):

And then we gave you kind of a preview of this with Terrarium earlier this year, so you can kind of see the progression of how we did the R&D around this project. Then in March, we open sourced what we call Lucet. Lucet is that compiler and runtime. We worked with Mozilla and others to build out this environment, and because we love open source, have taken advantage of open source, and it's a core part of us being successful today, do you think we were going to keep this great new technology of ours to ourselves and not open source it? No, we open sourced it. We gave it back to the community.

(12:45):

So some of the most critical pieces of this technology are open sourced in that Lucet repo and the surrounding repos. And again, if you have any open source projects, we give free CDN. So if you have anything, come talk to any of us, we'd love to help you out. These are some of the folks that are on us today. So we get asked this quite a bit as well. Anybody here really familiar with V8? It's okay if you don't want to raise your hand. So V8 is a runtime environment that was built into Chrome to separate out your tabs, right? So if a tab crashes, it won't affect your other tabs. It's useful, right, within a browser?

(13:25):

A lot of people ask us, "Why did you build your own? Why'd you build your own tech? Why didn't you use V8 like everybody else?" Well, there's a problem with V8: it was built for browsers. Browsers are a completely different environment than the edge, with big powerful servers, global, et cetera. And I like to compare it to an early decision we made around our routing infrastructure at the very beginning of Fastly. You've seen some of the talks and blog posts around this. Instead of going out and buying really expensive routers from the Junipers and the Ciscos of the world, we wanted more flexibility with our platform, and we had to build it ourselves.

(14:09):

So we ended up writing our own load balancing and routing infrastructure. And you can go and read about it in those blog posts. They're still up, and they're still some of our most popular blog posts out there. We had to do that. We would not have been able to get to where we are today without making those early decisions to build our own tech. And I think history is going to prove the exact same thing in this instance: that this was the right decision, to build this ourselves and not depend upon a technology that we don't feel is fit for this environment. And then we wanted to provide core building blocks for your edge apps. Some of those building blocks are support for many languages. I talked about you all wanting to use the languages that you know and love, so Rust is the language we chose as the launch language.

(14:56):

We want to give you access to reading and writing the request and response body. We want to give you concurrent fetch so you can talk to multiple backends at the same time and piece together content from those backends. And we want to take those building blocks, and when I go into my demos here in a minute, you'll start to see how these core pieces are critical for so many other use cases that we'll get to, right? These are the three most powerful pieces that you can Lego together and come up with something great. And why did we choose Rust? Well, we love it internally. A lot of our systems are built on it; in fact, a lot of the Lucet technology is written in Rust. And you all love it. It's the second fastest-growing language in GitHub's survey, something GitHub did earlier this year: 235% growth.

(15:51):

It's loved by far the most. Rust is the most loved language, by far, compared to anything else. So we wanted to give you a powerful language that you also love, and that's what we focused on for our first launch. One of the most critical pieces of this technology: if you're familiar with POSIX from back in your UNIX systems programming course, POSIX is a way to expose the computer to your program. How do you talk to files or file descriptors? How do you talk to network sockets? How do you actually program the environment?

(16:23):

Well, a critical piece of this world is called the WebAssembly System Interface, and WASI is the acronym for it. This is a quote from Solomon Hykes, who was the founder of Docker. WASI is a critical piece of this core that we've built into the technology. It's how we expose a UNIX computer to all of you when you program it, so you can use all the Rust crates that are out there, and when we start having multiple languages, you don't have to port a bunch of your third-party dependencies.
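
For a flavor of what that system interface means in practice, here's ordinary Rust against the standard library; compiled natively these calls hit the OS, and compiled with `--target wasm32-wasi` the same code talks to the host through WASI instead (illustrative only, since Compute@Edge exposes its own subset of the interface):

```rust
// Ordinary Rust using only the standard library. Compiled with
// `--target wasm32-wasi`, these same calls go through WASI rather than a
// native libc. Illustrative only; the system interface available inside
// Compute@Edge is its own subset of this.
use std::env;
use std::fs;

fn main() {
    let path = env::args().nth(1).unwrap_or_else(|| "hello.txt".to_string());
    match fs::read_to_string(&path) {
        Ok(contents) => println!("read {} bytes from {}", contents.len(), path),
        Err(err) => eprintln!("could not read {}: {}", path, err),
    }
}
```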

(16:52):

This is the founder and CTO of Docker basically saying that if WASM and WASI had existed several years ago, we wouldn't have needed Docker. So it's going to be a very powerful environment for all of you to write great code. So here are some example pieces, right? Here's body access, a nice single line. You have full access to the body, both the request and the response body. Read it and write it. Think of all the things you can do with that.
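
Something in the spirit of that slide, sketched with the public fastly crate (the backend name "origin_api" is made up for illustration):

```rust
// Hedged sketch of full body access: fetch from a backend, read the response
// body as text, rewrite it, and hand the modified body back to the client.
// "origin_api" is a hypothetical backend name; the API shape follows the
// public fastly Rust crate rather than the exact beta interface.
use fastly::{Error, Request, Response};

#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
    // Forward the client request to the origin and take the whole body.
    let body = req.send("origin_api")?.into_body_str();

    // Rewrite the body on the way through (trivial example).
    let rewritten = body.replace("http://", "https://");

    Ok(Response::from_body(rewritten))
}
```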

(17:18):

Concurrent multi-fetch. Sounds like the nerdiest thing ever. It's amazing. You can talk to multiple backends in parallel, take that data, and stitch it back together. There's so much you can do with that. When I tell people, they're like, "That doesn't sound that exciting." But it really is, and there's more coming, right?
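
A hedged sketch of what that parallel fetch-and-stitch can look like, again using the public fastly crate with made-up backend names and URLs:

```rust
// Hedged sketch of concurrent multi-fetch: start two backend requests in
// parallel, wait for both, and stitch the bodies into a single response.
// Backend names and URLs are invented; the API shape follows the public
// fastly Rust crate.
use fastly::{Error, Request, Response};

#[fastly::main]
fn main(_req: Request) -> Result<Response, Error> {
    // Kick off both origin fetches without blocking on either.
    let user_pending = Request::get("https://api.example.com/users/2").send_async("users_api")?;
    let products_pending = Request::get("https://api.example.com/products").send_async("products_api")?;

    // Wait for both responses, then piece the content together.
    let user = user_pending.wait()?.into_body_str();
    let products = products_pending.wait()?.into_body_str();

    Ok(Response::from_body(format!(
        "{{\"user\":{},\"products\":{}}}",
        user, products
    )))
}
```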

(17:40):

There are things that you've asked us for that are some of your most important things, right? You've asked for more. You want to be able to use the idiomatic syntax of the language, right? You don't want to use a very DSL-specific environment in Rust where you're basically writing a DSL, but then there's also some Rust sprinkled in. We're doing everything so it looks like the language that you're used to, right? Like, if you're using Go, it's going to look like Go. You're going to do everything the Go way. If you're going to use Rust, you're going to use async and the base technology within Rust.

(18:13):

Key value store, that's one you've all been asking us for. You've been asking us for full tracing and debugging support. That's critical. A virtual file system. There are a lot of great things that we're thinking about, and these are the things you're telling us are the most important. So what can I do with all this power? Let me show you some cool things. First, who here uses GraphQL and is excited about GraphQL? Be honest. There's a lot more of you than that. So GraphQL: this is a Google Trends graph showing how exciting and how important GraphQL is for all of you. We get asked about it all the time.

(18:48):

GraphQL is important because it allows your developers, especially your front end developers, to pick and choose the data that they want to display on the page from multiple backend APIs. So instead of having to build a new REST API for every component that they want to show on the front end, they can specify the information they need. So it's all about development velocity, being able to build great, highly personalized experiences. But one joke that Denenberg likes to tell me is that he's heard that front end devs don't like back end devs or something. That's why everybody loves GraphQL: because you don't have to talk to your back end developer anymore.

(19:26):

So GraphQL, again: this is a diagram showing kind of the interaction between GraphQL and whatnot. But what I want to be able to show you is some of the more intricate details of it. So here's an uncacheable, private users API. A JSON API, right? /user/2. Here's a products API that is cacheable. The previous API is personalized and can't be cached; this API is cacheable, so you want to make sure you cache it. What we want to be able to do is, on every request, go to the back end for the requests that we need to. But for the things where we can pull from the cache, we want to intermix those results, right? It saves a tremendous number of backend requests.

(20:14):

So this would be the GraphQL query that you would send. You're saying, give me the user with ID 2, and give me the name, email, cart, and all the information about the cart. And then what you want to be able to do is pull all those cart items down and serve them back up to the end user. So you take one request from an end user, and you're going to make multiple backend requests to find the data that you need to be able to show this cart, some of which will be cached, some of which won't. You stitch all that data back together and then give one answer back to the end user.
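
The stitching step itself could look something like this plain serde_json sketch, with field names invented to mirror the query described above:

```rust
// Hypothetical sketch of stitching a GraphQL-style response: a fresh,
// uncacheable user record is merged with cacheable product lookups into one
// JSON answer for the client. Field names are invented for illustration.
use serde_json::{json, Value};

fn merge_cart(user: &Value, cart_items: &[Value]) -> Value {
    json!({
        "data": {
            "user": {
                "name": user["name"],
                "email": user["email"],
                "cart": { "items": cart_items }
            }
        }
    })
}

fn main() {
    let user = json!({ "name": "Jane", "email": "jane@example.com" });
    let cart_items = vec![json!({ "id": 7, "title": "Widget", "price": 9.99 })];
    println!("{}", merge_cart(&user, &cart_items));
}
```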

(20:47):

Some more cool code, Rust examples, pretty neat. And then you want to get a GraphQL response back, right? It's just JSON that you'll serve back to your mobile app or your front end. So now I'm going to show you a video of our new CLI that you're going to be using with all of this. So if you watch, you run fastly package build, and it's going to pull in your crates, pull them into your environment. It's going to include the Fastly crate, which is how we expose the cache and the variables you used to set in VCL and whatnot. It's going to create a .wasm file, gzip it up, and then it'll be ready to be pushed out to the Fastly network. That took, what, a few seconds? That was pretty cool.

(21:30):

Now you want to deploy it. Boom, now it's deployed out to the edge. Just like you used to do with VCL. You've done it a hundred times. You do the exact same thing with this powerful Rust application that now has access to so much more of the technology. And then here's a demo. What you're going to see on the left is the GraphQL query. On the right, you're going to see the schema itself, and we're using just the open source Juniper Rust GraphQL library. And then in the middle, you're going to see the response, and if you look down at the very bottom, you'll see the three queries: one to the users API and two to the products API. And what this is going to do is we're going to change the query to get a new user, user 2, and we're going to add in some new data as part of the request.

(22:28):

So usually that would blow the entire cache up, and you'd have to go all the way back to the back end for everything. If you noticed, we did one fewer back end request. Because what we were able to do is pull that piece of the response out of the JSON, cache that, and then pull in additional data from other back ends, so we can start to cache individual snippets based on the responses.

(22:52):

Another cool use case of this is Dynamic Manifest Manipulation. There are a lot of you here who are in the OTT — over the top — video environment. This is another one I get asked about all the time, and I'm stoked about this one. So what this allows you to do: the manifest file, if you're not from the video world, basically points to all the individual video files. When you're watching anything online, your player is pulling down these little segments of video and audio, these little binary files, and the manifest file basically tells it the order of those files and which rendition to show and whatnot.

(23:27):

So some amazing new use cases open up if you can cache that manifest file, right? You don't want to go all the way back to the core, because that's where your encoding infrastructure is; it's extremely expensive and it's hard to scale. So what you want to be able to do is serve that manifest file from the edge and cache it for each individual user. Then you can start to do personalization on this manifest file, which is something you could never do before.

(23:55):

So server-side ad insertion is something I get asked about a lot. Server-side ad insertion is pretty amazing for the OTT world, where you know so much about the end user. Like, if I'm sitting at home watching the Super Bowl on one of my streaming platforms, they know who I am, right? So they can show me a better ad, instead of making a single ad for this particular event that they just blast out to 3 million people and hope it's relevant. Now what all of you can start to do is, since you know about your end user and what information they want to know more about, you can do customized advertisements based on what you know about those same users, while still caching that manifest out at the edge. So we call it edge-side ad insertion. Talk to me later about that one. That's a really critical one we've been asked about quite a bit.

(24:45):

Multi-CDN is another one you've asked about. Like, wait, the CDN is talking about multi-CDN? Isn't that like a taboo conversation? We're not supposed to talk about that? Well, you know us. If it's important to you, it's important to us, and what we get asked about a lot is how do I do better multi-CDN for big live events? There are a few ways you can do it today. You can do it in the player, but you don't control the player all the time, do you? Roku, some of the other tech that you have to build towards, you don't have access to that logic. So you have to do whatever the player allows you to do.

(25:20):

You can do it with DNS. It's not a bad way, but sometimes it takes five or 10 minutes for that failover to happen. And you can't target performance based on an end user, because all you see is the recursive server. You don't see the actual end user. And then there's the manifest, which nobody thought you could use for things like this. So this is how we've built out this demo, and I'm going to show it to you here in a second. It shows you how the manifest file can be rewritten on every single request by querying a performance API on the side to check, for this user, what's the fastest CDN? Is it still up and running? Instead of checking a CDN maybe every 10 minutes or every five minutes and then using DNS, where there's another five minutes of failover time. So we found a flaw in this method.

(26:08):

So I won't get into it too deep; I'm happy to talk after the fact with any HLS geeks in here who want to talk about it. But we actually found a flaw in one of the methods that we chose as part of this demo. So we were scrambling at the end, but what we found was there's a statement in the spec, and sometimes, you all know, in tech you have to kind of pivot. The way we do our rewriting, if you look here at this example, is in the manifest file itself: we change the path to those individual video segments based on which CDN is performing best for that particular end user. You can see that there. We actually put a Handlebars template in the manifest file, and then, using that body access, we rewrite it to point to the best CDN.

(26:54):

Well, we found a flaw with that, where the HLS spec says you can't do that. You have to have the same domain or URL for the life of a particular segment. So anyway, we spent a bunch of time, we got creative, and we worked around it. I'm happy to talk to you all about it after the fact. But what we allow you to do, using Mux or Cedexis or Catchpoint, whatever your third party performance API is, is a side lookup (using serve-stale, for all the caching nerds out there) that lets you do this async side lookup on every single request: for that user, which is the fastest CDN, which one is still up, and then rewrite the video file URLs to point to that particular CDN, right?

(27:33):

Again, it's just Rust code. You get to do everything great that you do in Rust today. Get best path — this is where we actually query that side API asynchronously, off to the side. Get the best CDN. Then rewrite that manifest file to point to the right one. A little code golf, just some cool Rust code I wanted to show you there. And then you can see that the manifest response is rewritten in real time as it passes through us. So I'm going to show you a video here. What this is going to show you: you see the one, that means it's on CDN1. If it switches to two, that means the performance API said to switch to CDN2, and you'll see at the bottom they're using our real-time logging.
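
As a rough illustration of that rewrite step (the placeholder format, CDN hostnames, and latency numbers are all invented; the real demo queried a third-party performance API):

```rust
// Hypothetical sketch of the manifest rewrite: pick whichever CDN the
// performance measurement says is fastest, then swap a placeholder host in
// the HLS manifest for that CDN. Placeholder syntax and hostnames are
// invented for illustration.
fn pick_best_cdn(cdn1_ms: u32, cdn2_ms: u32) -> &'static str {
    if cdn1_ms <= cdn2_ms {
        "https://cdn1.example.com"
    } else {
        "https://cdn2.example.com"
    }
}

fn rewrite_manifest(manifest: &str, best_cdn: &str) -> String {
    manifest.replace("{{cdn_host}}", best_cdn)
}

fn main() {
    let manifest = "#EXTM3U\n#EXTINF:4.0,\n{{cdn_host}}/video/segment001.ts\n";
    // Pretend the side API reported 38 ms for CDN1 and 55 ms for CDN2.
    print!("{}", rewrite_manifest(manifest, pick_best_cdn(38, 55)));
}
```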

(28:13):

We're streaming out the decision process so you can audit this system. Like, hey, it's Rust code. Because we've had people tell us in the past, "We don't trust you to do multi-CDN for us because you are in the list as well." Like, oh, amazingly Fastly was the fastest 100% of the time — weird. But with this, what's great about it is that it's your code, it's your Rust code. You're using your own third party performance and metrics API, and we're going to log out, on every decision, why we made that decision. So watch the video here. The terrain, it's very soothing. It's one... oh, it's two. It switched. That was amazing.

(28:55):

There's so much going on behind the scenes. I know it looks lame. You're like, "Oh, it went from one to two, great, nice work." But behind the scenes, it was doing so much. So there are tons of other use cases that you can do with this. Again, I showed you building blocks. One of them that we're excited about is API security. Normally with a CDN, you're handling things that are outbound, right? Leaving the CDN, leaving your back ends. Now, because we can read and write the request and the response body, we can start to look at the content going through us and do things like schema validation of your JSON responses, or looking for malware embedded in those API responses. There's a ton of security value that we see in this.
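
For example, a simple edge-side schema check on a JSON response might look like this (the field names and rules are invented; a real service would validate against its own schema):

```rust
// Hedged sketch of edge-side schema validation: before forwarding an API
// response, check that the JSON body has the fields and types you expect.
// The "user" shape here is invented for illustration.
use serde_json::Value;

fn looks_like_valid_user(body: &str) -> bool {
    match serde_json::from_str::<Value>(body) {
        Ok(json) => json["id"].is_u64() && json["email"].is_string(),
        Err(_) => false,
    }
}

fn main() {
    assert!(looks_like_valid_user(r#"{"id": 2, "email": "jane@example.com"}"#));
    assert!(!looks_like_valid_user(r#"{"id": "not-a-number"}"#));
    println!("schema checks passed");
}
```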

(29:41):

So what do we do next? A lot of you joined the beta. Thank you for doing that. It was amazing to see the responses that we got on the beta. We're overwhelmed right now with those responses. We'll get to you as soon as we can. We've launched this as a private beta because we're going to slow-roll this out, get a lot of feedback, hear from you about what more you want, and then we'll start bringing more and more folks into the beta over time.

(30:07):

But all I ask is be patient. I know you're excited. We're excited. We want to give this to you. We want to give you full access to it. We're just going about it with a very measured process of bringing people on to use this. So please come and find me afterward and talk to me about it. That's Tom Hanks. You guys, come on, that's a good joke. Nothing? Thank you. You could clap if you'd like to. That's okay. But anyways, thank you very much.