How Terrarium reframes the compiler and sandbox relationship

CTO, Fastly

December 10, 2018

A few weeks ago we released Fastly Labs — a hub of experimental, hands-on projects at the edge — and with it, Terrarium. Terrarium is effectively a technology demonstration, which enables anyone to show up, write some code in a variety of languages, and deploy it safely within a few seconds. It's the first time we've shown off the work our internal Fastly teams have been putting in to develop a high-performance WebAssembly compiler and runtime.

The phrase "technology demonstration" carries some weight with it. If someone is using the term, they expect that you’ll believe what they've built is new and novel and interesting. So, what is new and novel and interesting about the technology behind Terrarium? And while we're at it, why did we bother when there are existing WebAssembly runtimes? What problem is this actually solving?

Breaking down the problem

Terrarium uses something new that we've been working on inside Fastly for quite some time now. It started with the realization that existing sandboxes, isolates, etc., were much heavier and slower to start than they needed to be. Great work has been done across the industry over the last few years to improve that situation. Just recently we've seen container startup times get down to the low hundreds of milliseconds, which, surprisingly enough, puts them ahead of most in-process sandboxes as well.

These sandboxes, containers and VMs in particular, all have such slow startup times because they’re attempting to isolate code that was never intended to be isolated. Typical compilers take higher-level code and produce machine code, and that machine code comes with very few guarantees about its behavior. The sandbox has to assume that the code may attempt to access anything it wants in its environment, that it will try to read or write out-of-bounds areas of its virtual memory space, that it will raise arbitrary signals, or that it will attempt to make arbitrary syscalls. Essentially, this means the sandbox needs to provide an enormous environment for that code, one that mimics the facilities provided by a typical operating system. That's no small feat.

Our realization was that if you can make the compiler and the sandbox work together, rather than at odds with each other, this problem gets significantly easier. This would not be a tenable approach, however, if you had to develop a new compiler for every language you wanted to support. In the past, that would have been the only option.

WebAssembly changes that. For the first time, we have an intermediate representation which is targeted by multiple high-level languages, with serious industry momentum, and safety built in from the start. The semantics of its operations make sandboxing the code compiled with it much easier than it would be otherwise. The concepts of memory bounds-checking, memory limits, and control flow integrity come baked in from the get-go.
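To make the bounds-checking and memory-limit ideas concrete, here is a minimal sketch in Python of a WebAssembly-style linear memory. It is purely illustrative and is not how Terrarium or our compiler implements anything: the names `LinearMemory` and `Trap` are made up for this example, and a native-code runtime would typically enforce the same rules with compiler-inserted checks or guard pages rather than method calls. The only details taken from WebAssembly itself are the 64 KiB page size, little-endian 32-bit loads and stores, and `memory.grow` returning the previous size in pages or -1 on failure.

```python
import struct

PAGE_SIZE = 65536  # WebAssembly linear memory is sized in 64 KiB pages


class Trap(Exception):
    """Hypothetical exception: the guest did something the sandbox forbids."""


class LinearMemory:
    """Illustrative model of a guest's linear memory (not Terrarium's code)."""

    def __init__(self, initial_pages, max_pages):
        if initial_pages > max_pages:
            raise ValueError("initial size exceeds the declared maximum")
        self.max_pages = max_pages
        self.data = bytearray(initial_pages * PAGE_SIZE)

    def pages(self):
        return len(self.data) // PAGE_SIZE

    def _check(self, addr, size):
        # Every access is validated against the current memory size,
        # so the guest can never touch anything outside its own memory.
        if addr < 0 or addr + size > len(self.data):
            raise Trap(f"out-of-bounds access at {addr} (+{size} bytes)")

    def load_i32(self, addr):
        self._check(addr, 4)
        return struct.unpack_from("<i", self.data, addr)[0]

    def store_i32(self, addr, value):
        self._check(addr, 4)
        struct.pack_into("<i", self.data, addr, value)

    def grow(self, delta_pages):
        # Mirrors memory.grow: return the old size in pages, or -1 if the
        # request would exceed the memory limit.
        old = self.pages()
        if delta_pages < 0 or old + delta_pages > self.max_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return old
```

A short usage example, under the same assumptions:

```python
mem = LinearMemory(initial_pages=1, max_pages=2)
mem.store_i32(0, 42)
value = mem.load_i32(0)       # reads back the stored value
try:
    mem.load_i32(PAGE_SIZE)   # one past the end of the single page
except Trap as err:
    print("trapped:", err)
```

The point of the sketch is that the guest has no way to express an access that skips the check, which is what lets the sandbox be so much smaller than one built around arbitrary machine code.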

How Terrarium works

What makes Terrarium interesting as a technology demonstration is that it's the first public glimpse of our new WebAssembly compiler, runtime, and surrounding tools that we've been working on. Our compiler takes a WebAssembly module and outputs a library which has been compiled to native machine code.

Your code is uploaded as a tar file and then run through a compilation pipeline. At the end of the pipeline, your code is deployed into a running web server, which exposes a basic set of APIs that provide the functionality you need to try experiments at the edge. Finally, we wrapped those APIs up into libraries for each of the languages so that using them feels more natural to you.

The state of WebAssembly

However, the WebAssembly ecosystem is still getting its footing. The WebAssembly backends for many compilers are still in their infancy. Tooling is still rough. Things like API standards are still quite fluid. What we did with Terrarium is take the three languages with the best current support and do all the work of configuring the compilers, wiring up the APIs, creating a compilation pipeline, and, finally, building an HTTP-based deployment platform for you.

If the details of how all this works interest you, dear reader, you’re in luck. This is just the first post in a series about how Terrarium works. We intend to talk about the shenanigans needed to get each of the languages working in an environment like this, some of the trickier parts of this kind of system, such as memory management and performance, and anything else you have questions about. So give Terrarium a go for yourself, tinker around, and tell us what you're most curious about as we move forward.