Rewriting HTML with the Fastly JavaScript SDK

Staff Software Engineer, WebAssembly

Customizing webpages for the viewer can require rewriting HTML retrieved from another source. Perhaps we make a request to a service owned by another team and need to selectively modify parts of the response. Maybe we want to cache the static parts of the page on the edge, closer to users, and perform the customization at a physical location that minimizes request latency. The Fastly Compute JavaScript SDK now comes with a streaming HTML rewriter that can achieve these goals with an ergonomic API that fits into existing web standards and outperforms general-purpose server-side DOM manipulation solutions.
The HTML rewriter is available in version 3.35.0 of the JS SDK, and documentation is available in the @fastly/js-compute docs.
How the HTMLRewritingStream API Works
The JS SDK provides an HTMLRewritingStream type that lets you register rewriting callbacks on CSS selectors. When the rewriter encounters an element matching the selector, it calls the registered callback. This callback can manipulate the attributes of the element and add or remove content from the immediate context. After creating an instance of this type, you can pipe an HTML stream through it. For example, if we want to prepend the text "Header: " to any h1 tags and add an attribute to div tags, we could write this:
/// <reference types="@fastly/js-compute" />
import { HTMLRewritingStream } from 'fastly:html-rewriter';

async function handleRequest(event) {
  // Register one callback per CSS selector.
  const transformer = new HTMLRewritingStream()
    .onElement("h1", e => e.prepend("Header: "))
    .onElement("div", e => e.setAttribute("special-attribute", "top-secret"));

  // Fetch the page and stream its body through the rewriter.
  const response = await fetch("https://example.com/");
  const body = response.body.pipeThrough(transformer);

  return new Response(body, {
    status: 200,
    headers: new Headers({
      "content-type": "text/html"
    })
  });
}

addEventListener("fetch", (event) => event.respondWith(handleRequest(event)));
You construct the HTMLRewritingStream with two callbacks: one for h1 elements and one for div elements. You make an HTTP request to retrieve an HTML page and pipe the response through the rewriting transformer. Finally, you return a response containing the rewritten body.
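Because the callbacks passed to onElement are ordinary functions, they can be named and exercised on their own before being wired into a rewriter. The sketch below is illustrative only: openInNewTab, labelHeading, and createStubElement are hypothetical names, the stub is not part of the SDK, and the handlers rely only on the setAttribute and prepend methods used in the example above.

```javascript
// Hypothetical handlers: each receives the matched element and uses only the
// setAttribute and prepend methods shown in the example above.
function openInNewTab(el) {
  el.setAttribute("target", "_blank");
  el.setAttribute("rel", "noopener noreferrer");
}

function labelHeading(el) {
  el.prepend("Header: ");
}

// Minimal stand-in for a rewriter element, for local testing only.
// It records calls instead of touching real HTML.
function createStubElement() {
  const attributes = {};
  const prepended = [];
  return {
    attributes,
    prepended,
    setAttribute(name, value) { attributes[name] = value; },
    prepend(content) { prepended.push(content); },
  };
}

// In a Compute service the handlers would be registered like this:
//   new HTMLRewritingStream()
//     .onElement("a", openInNewTab)
//     .onElement("h1", labelHeading);
```

Keeping handlers as named functions also makes it easier to reuse the same rewrite across several selectors.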
HTMLRewritingStream is a type of TransformStream — a part of the Streams API web standard. This has two main benefits. Firstly, HTML is processed in a streaming fashion where the document is received, processed, and sent to the client in fragments rather than having to wait for the entire document to be transmitted and transformed. Secondly, the rewriter fits into existing pipelines based on TransformStream, so if you have custom streaming transformations that you want to perform before or after the HTML rewrite, you simply chain another pipeThrough call, like this:
body.pipeThrough(aCustomTransformer)
  .pipeThrough(htmlRewriter)
  .pipeThrough(anotherCustomTransformer);
The API follows Akamai’s html-rewriter interface with the exception of the insert_implicit_close option, which is not supported, and the addition of an escapeHTML option, which enables the inclusion of HTML as text in rewritten elements. The implementation uses Cloudflare’s lol-html Rust crate to perform on-the-fly parsing and rewriting.
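To make the chained pipeThrough calls above more concrete, here is a minimal sketch of what a custom stage might look like. It uses only the standard Streams API; createByteCounter, its onDone callback, and the demo function are hypothetical names chosen for illustration, and the in-memory source stands in for a real response body.

```javascript
// Hypothetical helper (not part of the Fastly SDK): a pass-through stage that
// counts the bytes flowing through it and reports the total when the stream ends.
function createByteCounter(onDone) {
  let total = 0;
  return new TransformStream({
    transform(chunk, controller) {
      total += chunk.byteLength; // response bodies arrive as Uint8Array chunks
      controller.enqueue(chunk); // forward the chunk unchanged
    },
    flush() {
      onDone(total); // called once, after the last chunk
    },
  });
}

// Standalone check with an in-memory stream, so no network or SDK is needed.
async function demo() {
  const encoder = new TextEncoder();
  const source = new ReadableStream({
    start(controller) {
      controller.enqueue(encoder.encode("<h1>Hello</h1>"));
      controller.enqueue(encoder.encode("<div>World</div>"));
      controller.close();
    },
  });
  let counted = 0;
  const out = source.pipeThrough(createByteCounter((n) => { counted = n; }));
  const text = await new Response(out).text();
  return { text, counted };
}
```

In a Compute handler, a stage like this could sit on either side of the rewriter, for example body.pipeThrough(createByteCounter(n => console.log(n))).pipeThrough(htmlRewriter).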
Performance Benchmarks: Fastly HTMLRewritingStream vs LinkeDOM
Due to the underlying streaming-based low-latency parsing and rewriting infrastructure, the HTMLRewritingStream greatly outperforms pure JS libraries such as LinkeDOM. The graph below shows processing time in milliseconds for rewriting a number of image tags along with their labels and descriptions:

Note the logarithmic axes. In this test, HTMLRewritingStream is roughly 20 times as fast as LinkeDOM. Measurements were taken by running locally on an M3 MacBook Pro and averaging over 20 runs.
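For readers who want to take similar measurements themselves, averaging over repeated runs can be sketched as follows. This is not the harness used to produce the numbers above; averageMs is a hypothetical helper, and it assumes a runtime that exposes the global performance.now() timer.

```javascript
// Hypothetical helper: runs an async task `runs` times and returns the mean
// wall-clock duration in milliseconds.
async function averageMs(task, runs = 20) {
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await task(); // wait for the whole rewrite to finish before stopping the clock
    total += performance.now() - start;
  }
  return total / runs;
}
```

The task passed in would wrap one complete rewrite of the test page, including reading the rewritten stream to the end, so that streaming and non-streaming approaches are timed over the same amount of work.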
You can see the difference in action in this small demo, which rewrites images with a picture of Nicolas Cage (note that selecting higher image counts for the LinkeDOM implementation may produce CPU timeouts).
Conclusion: High-Performance HTML Rewriting at the Edge with Fastly Compute
The HTML rewriter feature of the Fastly JavaScript SDK is now available in version 3.35.0. It gives you an ergonomic, performant way of rewriting HTML that fits into existing web standards. Please give it a try and let us know how it is helping your Compute workloads.