Code-splitting and minimal edge latency: the perfect match

Principal Developer Advocate, Fastly

November 02, 2020

Fastly Fiddle, our code playground tool, is a React single-page app that uses the excellent Monaco IDE component that powers VS Code. Problem is, Monaco is huge. And most uses of Fiddle are read-only. Do we really need to load a whole IDE to display some non-editable code? No! Is lazy loading code that's cached at the edge really fast? Yes! Can I stop asking rhetorical questions and tell you how to do this? Sure!

As single-page apps have become more and more popular, it's been increasingly tempting for web developers to ship huge amounts of JavaScript upfront. A classic example is a news website that loads a comment tool on every page load, even though most users won't post a comment.

There's a trade-off here between three things:

Developer experience: Bundling all the code together is easier.
User experience on initial load: Faster is better.
User experience during use: Instant is good.

The ship-it-all-up-front method optimizes for developer experience and, for highly engaged users, guarantees all the features are primed for instant use — but at the expense of casual users getting a really slow first-load time.

Conversely, doing a lot of code-splitting can substantially improve the initial load time but increases development complexity and introduces latency to operations that have to load additional code from the network on demand.

However, developers often overestimate how engaged the average user is, and therefore also overestimate the value of shipping a lot of code upfront. And the pain of loading code on demand is mostly an issue of network latency, which is where having it cached at the edge is going to make a huge difference.

Fastly Fiddle and the 2MB IDE

We're certainly not innocent parties here. Fastly's Fiddle app, a project I started about three years ago, has many dependencies, but all pale into insignificance compared to the monster that is the Monaco code editor. The create-react-app framework documents an easy way to analyse the bundle produced by the build, and, well, here's the result:

monaco-editor, at 2.1MB, is 80% of the bundle. When Fiddle loads, it normally displays several instances of Monaco immediately, so you could argue that it's reasonable to load it right away, but it's also common for Fiddle to be used to 'embed' interactive code examples within our developer hub, and those code examples are read-only. There's one right here on the Developer Hub homepage:

There is clearly no reality in which it makes sense to load a 2MB code editor component and use it to display non-editable code!

Code-splitting to isolate the big dependency

Fortunately, code-splitting React apps is pretty easy. First, find the point in the app where the large dependency is used as a component. In my case, a good choice is a component called VclFunctionSet, which wraps all the instances of the code editor. I found it using React DevTools:

Then, I needed to find the file in which that component is used and start by changing the imports. I added an import for the Suspense component from React core and changed the line that imported the really big component into a special lazy import:

import React, { Suspense } from 'react';
const VclFunctionSet = React.lazy(() => import('./components/VclFunctionSet'));

Later in the file, where VclFunctionSet is used, I wrapped it in a Suspense:

<Suspense fallback="Loading...">
 <VclFunctionSet />
</Suspense>

If you're using create-react-app, then this is all you need to do to trigger the build process to split your code into multiple bundles.
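
To make the mechanism a little less magical, here is a minimal plain-JavaScript sketch of the idea behind a lazy import: nothing is fetched until something first asks for the module, and the same promise is reused afterwards. This is not React's implementation. `lazyModule` and `fakeImport` are hypothetical names, and `fakeImport` stands in for a real `import('./components/VclFunctionSet')` call.

```javascript
// Hypothetical helper: wraps a loader so the module is requested at most once.
function lazyModule(loader) {
  let pending = null; // nothing is loaded until the first call
  return function load() {
    if (pending === null) {
      pending = loader();
    }
    return pending; // later callers share the same promise
  };
}

// Stand-in for `() => import('./components/VclFunctionSet')`.
let fetchCount = 0;
const fakeImport = () => {
  fetchCount += 1;
  return Promise.resolve({ default: 'VclFunctionSet' });
};

const loadVclFunctionSet = lazyModule(fakeImport);

// The first call triggers the "network" request; the second reuses it.
loadVclFunctionSet();
loadVclFunctionSet();
console.log(fetchCount); // 1
```

In the real app, the bundler sees the dynamic `import()` and emits the target as a separate chunk, which is what makes the deferred request possible.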

Edge-caching the modules

Already, the Fiddle app was loading noticeably faster, because the browser can render most of the UI before loading Monaco. Let's look at the network graph:

Another benefit we've already gained here is that if we release a new version of the Fiddle app that doesn't change anything in the chunk that contains Monaco, webpack should give that chunk the same URL fingerprint, and the browser will be able to continue to use a locally cached version even through an update to the application code!
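
The fingerprinting idea can be sketched in a few lines: derive part of the file name from the file's contents, so unchanged contents keep the same URL. This is only an illustration, not how webpack computes its hashes. The FNV-1a hash, the `chunkFileName` helper, and the file name pattern are all assumptions made for the example.

```javascript
// FNV-1a, a small non-cryptographic hash, standing in for the bundler's real
// content hashing. It hashes UTF-16 code units, which is fine for a demo.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical naming scheme: <name>.<content hash>.chunk.js
function chunkFileName(name, source) {
  return `${name}.${fnv1a(source)}.chunk.js`;
}

// First release.
const monacoV1 = chunkFileName('monaco', 'editor code');
const appV1 = chunkFileName('main', 'app code v1');

// Second release: the app code changes, the editor code does not.
const monacoV2 = chunkFileName('monaco', 'editor code');
const appV2 = chunkFileName('main', 'app code v2');

console.log(monacoV1 === monacoV2); // true: the cached copy is still valid
console.log(appV1 === appV2); // false: the browser fetches the new app code
```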

The round trip time to Fastly from most places on Earth is pretty low, and these assets are all cached by Fastly, so we can see that most of the time needed to get the extra chunks of JS is the download time. Let's focus in on that whopping 701kB chunk where Monaco gets loaded:

Taking 97ms to download 700kB of data is not too shabby for a home broadband connection in 2020 (fiber hasn't quite made it to me yet!), but you can see just how much the volume of data and the download time matter when latency is low.
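
As a rough sanity check on those figures, the arithmetic below turns them into an effective throughput and uses that to estimate the transfer time for a smaller payload. It is a back-of-the-envelope sketch: `estimateDownloadMs` is a hypothetical helper, the 100kB payload is just an example size, and the model ignores round trips, TCP slow start, and parse time.

```javascript
// Figures quoted in the text: about 700kB on the wire in 97ms.
const bytesOnWire = 700 * 1000;
const downloadSeconds = 0.097;

const bitsPerSecond = (bytesOnWire * 8) / downloadSeconds;
const megabitsPerSecond = bitsPerSecond / 1e6;

// Hypothetical helper: transfer time in milliseconds at the same throughput.
function estimateDownloadMs(bytes) {
  return (bytes * 8 * 1000) / bitsPerSecond;
}

console.log(megabitsPerSecond.toFixed(1)); // "57.7" (Mbit/s)
console.log(estimateDownloadMs(100 * 1000).toFixed(1)); // "13.9" (ms for 100kB)
```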

Incidentally, that 700kB on the wire is significantly larger when uncompressed. Before you consider code-splitting, do make sure you are compressing your responses. If you're using Fastly, you can enable this in the web interface with a click or use beresp.gzip and beresp.brotli in VCL. All of Fiddle's resources were already compressed, so I get a gold star for that at least.
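
If you want a feel for how much compression buys you before touching any edge configuration, a small Node.js script along these lines can compare sizes locally. It is a sketch that assumes Node's built-in `zlib` module. The sample source is deliberately repetitive, so the ratios it prints will flatter real-world bundles.

```javascript
const zlib = require('node:zlib');

// Stand-in for a built bundle; real JavaScript is less repetitive than this.
const source = 'export function hello(name) { return "Hello, " + name; }\n'.repeat(2000);
const raw = Buffer.from(source, 'utf8');

// Compress the same bytes with both algorithms at their default settings.
const gzipped = zlib.gzipSync(raw);
const brotli = zlib.brotliCompressSync(raw);

console.log({ raw: raw.length, gzip: gzipped.length, brotli: brotli.length });
```

To try it on a real build, swap the sample string for the contents of one of your own bundle files.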

To summarize, we are taking advantage of low latency between the client and Fastly to load the minimum amount of JavaScript needed to render the page and keep the download transfer time low. Then, while rendering the UI, we can download the additional data lazily.

Load it earlier with prefetch?

Getting slightly obsessed with this, my gaze was drawn to the small gap between loading the now-much-smaller initial render resources and the point when we kick off loading for Monaco's whale of a module. Here is that 40ms gap blown up in the performance tab of Chrome DevTools:

This is a period of high CPU usage but low network activity. Google Analytics chooses this moment to load, but surely we could be loading Monaco here too?

This seems like a nice use case for Link rel=prefetch, the slightly less stylish cousin of preload, but alas I found that prefetch does not trigger the script to load any earlier. Preload certainly does, but that results in our 1MB monster loading alongside the render-critical modules, which slows down the first load. No dice.
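
For reference, both hints can be expressed as an HTTP `Link` response header. The sketch below only formats the header values so the two variants can be compared side by side. `resourceHint` is a hypothetical helper and the chunk path is made up, since the real file name depends on the build.

```javascript
// Hypothetical helper: format a script resource hint as a Link header value.
function resourceHint(path, rel) {
  if (rel !== 'prefetch' && rel !== 'preload') {
    throw new Error(`unsupported rel: ${rel}`);
  }
  return `<${path}>; rel=${rel}; as=script`;
}

// Made-up chunk path for illustration.
const monacoChunk = '/static/js/monaco.chunk.js';

// prefetch: a low-priority hint that the resource may be needed later.
console.log(`Link: ${resourceHint(monacoChunk, 'prefetch')}`);
// preload: a high-priority fetch for the current page.
console.log(`Link: ${resourceHint(monacoChunk, 'preload')}`);
```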

We could trigger Monaco to load in code, but ultimately, well, we're talking about saving 40ms and the spirit of JIRA is looking at me disapprovingly and pointing at a list of 1,000 tickets I haven't finished yet.
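
For the curious, triggering the load in code might look something like the sketch below, which starts the import once the browser reports idle time. I did not ship this. `preloadWhenIdle` is a hypothetical helper, and the approach assumes the bundler reuses an already requested chunk when the lazy component later asks for the same module.

```javascript
// Hypothetical helper: run a loader when the browser is idle. The optional
// `schedule` argument exists so the scheduling can be swapped out or tested.
function preloadWhenIdle(loader, schedule) {
  const run =
    schedule ||
    (typeof requestIdleCallback === 'function'
      ? (callback) => requestIdleCallback(callback)
      : (callback) => setTimeout(callback, 1)); // fallback where idle callbacks are unavailable
  return new Promise((resolve, reject) => {
    run(() => {
      try {
        resolve(loader()); // adopts the loader's promise, if it returns one
      } catch (error) {
        reject(error);
      }
    });
  });
}

// In the app this would be something like:
// preloadWhenIdle(() => import('./components/VclFunctionSet'));
```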

Use an alternative for read-only cases

In any case, the big win is to be found in not loading Monaco at all, which we should be able to do for the embedded version of Fiddle where the code is never editable. The only thing we need for read-only code is line numbering and syntax highlighting, so PrismJS fits the bill there nicely and comes in at only 15kB. I made a CodeBlock component that renders the code using Prism and then split the rendering logic based on whether the fiddle was in an embedded view or not:

{isEmbedded ? (
  <CodeBlock code={vclSources} />
) : (
  <Suspense fallback="Loading...">
    <VclFunctionSet />
  </Suspense>
)}

Now let's look at the JavaScript that is fetched when Fiddle is loaded in embedded mode:

A thing of beauty. Sure, I could switch to Preact or render the initial view of the embedded fiddle on the edge server and barely load any JavaScript at all until the user interacts with it, but we're down to a shade over 100kB now where we were comfortably over 1MB before!

Conclusions

Modern web development often puts a lot of emphasis on latency and avoiding sequential, blocking fetches. This is absolutely right. But if you know you can rely on low latency, then speed becomes more an issue of choosing to download less data.

Code-splitting can be an effective way to do this while keeping the complete application fully cached at the edge, waiting in the wings for remaining modules to be downloaded as soon as they are needed.

Happy to say I'm now practising what I'm preaching. Go try writing a fiddle and let me know if it's fast enough for you!