Beacon termination

Your website includes JavaScript on the client side that generates analytics, and you want to collect this data, but want to avoid an uncacheable request reaching your servers for every page view. Fastly's real-time logging can help.

Beacons are HTTP requests, normally POSTs, sent from a web browser to record some analytics data. When these requests first started to be used on the Web, they would masquerade as 1x1px images, but these days browsers offer native support for beacons via the navigator.sendBeacon method, and native apps will also now often send beacon data back to base using a similar POST request.

Data collected from beacons has myriad uses: business metrics, customer behavior modelling, performance monitoring, and so on. Sometimes you want to see it in real time, and for other use cases you can more efficiently analyse it in batches at the end of the day. Some data benefits from being treated as discrete events, while other data looks more like a time series. In each case, the frequency of delivery and the tool you use to store and process the data need to match your use case.

Fastly supports a wide range of third parties that will ingest your log data. Using these, you can divert beacon request payloads to a log endpoint, and avoid putting load on your own infrastructure.

Instructions

Configure an inert log endpoint

Start by creating a log endpoint to which you will send your log data. By default, Fastly will log every request, so take care to configure the log endpoint to be inactive; you can then invoke it explicitly from your code.

Choose from our list of supported logging providers. Which provider you choose depends on how you want to analyse the data. For example, if you are storing data for later analysis, choose static bucket storage like Amazon's S3, Google's Cloud Storage, Azure Blob Storage or DigitalOcean Spaces. These generally offer low prices per gigabyte and aren't fussy about how you format your data.

If you want instant access to the data as it comes in, and want to model your data as discrete events, use a provider like Splunk, New Relic or Papertrail.

You can even go directly to records in a general purpose database by connecting Google's BigQuery. In that case, you should pay close attention to formatting, since BigQuery is fussy about accepting only JSON formatted data and only a subset of date formats.

There are two ways to set up a log endpoint:

  1. Use the Fastly management UI and follow the instructions for setting up a log endpoint
  2. Create a log endpoint via a POST request to our API

To make the endpoint inert (in other words, to disable automatic logging to the endpoint), set the placement property to "none" when creating the endpoint via the API. In the web interface, select "None" in the "Placement" field.

Name your log endpoint something that you can remember; you'll need to reference it from your code later.
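
As a rough sketch of the API route, the function below builds a request that creates an endpoint with automatic logging disabled. It assumes the logging API's usual path shape of /service/{id}/version/{version}/logging/{type}, a form-encoded body, and the lowercase value none for placement. The function name, the extraFields parameter and all IDs are placeholders invented for this example, and each provider needs its own additional fields (bucket names, credentials and so on), so check the API reference for your provider before relying on it.

```javascript
// Builds (but does not send) a request that creates an inert log endpoint.
// Assumption: form-encoded POST to /service/{id}/version/{version}/logging/{type}.
function buildLogEndpointRequest({ serviceId, version, type, name, apiKey, extraFields = {} }) {
  const body = new URLSearchParams({
    ...extraFields,     // provider-specific settings (placeholders in this sketch)
    name,               // referenced later from the VCL log statement
    placement: "none",  // disables automatic logging to this endpoint
  });
  return {
    url: `https://api.fastly.com/service/${serviceId}/version/${version}/logging/${type}`,
    options: {
      method: "POST",
      headers: {
        "Fastly-Key": apiKey,
        "Content-Type": "application/x-www-form-urlencoded",
      },
      body: body.toString(),
    },
  };
}

// Usage (this part performs the network call):
// const { url, options } = buildLogEndpointRequest({ /* your values */ });
// const response = await fetch(url, options);
```

Keeping the request construction separate from the fetch call makes it easy to inspect exactly what will be sent before touching a real service.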

Intercept applicable logging requests

Decide on the path that you will use to receive log data. Data is easiest to process when it arrives as a query string, so don't match the request path against req.url, which includes the query string. Instead, compare your desired path against req.url.path:

sub vcl_recv { ... }
Fastly VCL
if (req.url.path == "/log") {
  error 618 "beacon:log";
}

Since you do not intend to send this request to your origin server or use a cached object, you need to treat it as an error. This allows you to transfer control of the request from the vcl_recv subroutine to the vcl_error subroutine, skipping the cache lookup. We recommend that when triggering a custom error, you should use a status code in the 6xx range, which we guarantee will never be reserved by Fastly, and is also not standardized as part of HTTP. Including an additional custom 'response status text' ("beacon:log" in the example here) further ensures that you will not conflict with any other code you include in your configuration that might use the same error number.

Intercept the error and extract the data to be logged

You can now create the code in the vcl_error subroutine that you need, to intercept the error thrown from vcl_recv.

sub vcl_error { ... }
Fastly VCL
if (obj.status == 618 && obj.response == "beacon:log") {
  # ... later steps in the tutorial will add code here
}

When an error is thrown, an obj variable is created that represents a synthetic HTTP response, and will acquire the status code and response status text defined when it was thrown. It's worth matching both the error status code and the response status text, to ensure that you don't conflict with other errors.

If your incoming data is on the query string of the URL, then it can already be accessed via req.url.qs.

If you use navigator.sendBeacon in your frontend JavaScript to create the request, it will have been sent as a POST request, but can still carry the data on the query string, and that's the easiest way to make it available to work with in VCL:

JavaScript
let payload = new URLSearchParams();
payload.append("loadTime", 43.2);
navigator.sendBeacon("/log?" + payload.toString());

However, in JavaScript, navigator.sendBeacon can also send a payload in the body of the request, and the format it sends it in will depend on which type of object you pass to the sendBeacon function:

  • sendBeacon(_string_, _URLSearchParams_): If the payload is a URLSearchParams then the content type will be application/x-www-form-urlencoded (except in Chrome where, due to a bug, the request will have Content-type: text/plain even though the payload is still URL-encoded form data, i.e. a query string).
  • sendBeacon(_string_, _FormData_): If the payload is a FormData then the content type will be multipart/form-data (this is not recommended, as no parser exists in VCL).
  • sendBeacon(_string_, _string_): If the payload is a plain string (such as the output of JSON.stringify) then the content type will be text/plain.
  • sendBeacon(_string_, _Blob_): If the payload is a Blob then the content type will be the type specified in the Blob.

All in all, and especially considering the bug in Chrome, we recommend constructing a URLSearchParams and simply serializing it on the end of the beacon URL as shown above.
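
The recommended approach can be wrapped in a small helper, sketched below. The function names buildBeaconUrl and sendAnalytics, and the extra page field, are made up for this example; the /log path matches the one used in the VCL above.

```javascript
// Serializes a flat object of analytics values onto the beacon path.
function buildBeaconUrl(path, data) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(data)) {
    params.append(key, String(value)); // every value is sent as a string
  }
  return path + "?" + params.toString();
}

// Sends the beacon when the environment supports navigator.sendBeacon.
// Returns true if the browser queued the request, false otherwise.
function sendAnalytics(data) {
  if (typeof navigator === "undefined" || typeof navigator.sendBeacon !== "function") {
    return false;
  }
  return navigator.sendBeacon(buildBeaconUrl("/log", data));
}

// Example: sendAnalytics({ loadTime: 43.2, page: "/home page" });
// requests /log?loadTime=43.2&page=%2Fhome+page
```

URLSearchParams takes care of percent-encoding, so values containing spaces, slashes or ampersands arrive intact in req.url.qs.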

Dealing with JSON input

If not a query string (URLSearchParams), the next most likely format you might want to use for your beacon is JSON. If you are dealing with JSON beacons, you might find it easier just to log the data directly from req.body.

WARNING: Both query strings and request body content are subject to limits on the maximum length we can process. See our resource limits documentation for details.
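
If you do go with JSON, a client-side sketch along these lines keeps the body easy to handle later in this tutorial. It relies on the fact that JSON.stringify without an indent argument emits a single line and escapes any newlines inside string values. The function names are illustrative only. Because the body is passed as a plain string, the browser will label it text/plain, as described in the list above.

```javascript
// Serializes analytics data as compact, single-line JSON.
function buildJsonBeaconBody(data) {
  // Require a plain object so that the body is a JSON object ending in "}".
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    throw new TypeError("beacon data must be a plain object");
  }
  return JSON.stringify(data); // no indent argument, so no raw newlines
}

// Sends the JSON in the request body when sendBeacon is available.
function sendJsonAnalytics(data) {
  if (typeof navigator === "undefined" || typeof navigator.sendBeacon !== "function") {
    return false;
  }
  return navigator.sendBeacon("/log", buildJsonBeaconBody(data));
}
```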

Enrich the data (optional)

The overall goal of this solution is to log the data you're receiving from the client (browser or native app), but Fastly has a wealth of additional data about the request that you might want to capture as well. Take a look at our VCL variables to understand what is available. Here are some examples of the data most often captured at this point:

sub vcl_error { ... }
Fastly VCL
set req.url = querystring.add(req.url, "clientCountry", client.geo.country_code);
set req.url = querystring.add(req.url, "timestamp", strftime({"%Y-%m-%dT%H:%M:%SZ"}, time.start));
set req.url = querystring.add(req.url, "pop", server.datacenter);
set req.url = querystring.add(req.url, "autoSystemNumber", client.as.number);
set req.url = querystring.add(req.url, "clientIP", req.http.fastly-client-ip);

With query string data, you can take advantage of our suite of query manipulation functions to add extra parameters to the query. Here, querystring.add appends the client's country, a timestamp, the Fastly POP that handled the request, and the client's autonomous system number and IP address.

Alternative: Enriching JSON

If you are dealing with incoming JSON in req.body, then you can splice the client-supplied data, and the Fastly variables, into a string variable:

sub vcl_error { ... }
Fastly VCL
declare local var.jsonData STRING;
set var.jsonData = regsub(req.body, "\}$", ", ") +
  {""timestamp": ""} + strftime({"%Y-%m-%dT%H:%M:%SZ"}, time.start) + {"", "} +
  {""clientCountry": ""} + client.geo.country_code + {"", "} +
  {""pop": ""} + server.datacenter + {"", "} +
  {""autoSystemNumber": "} + client.as.number + {", "} +
  {""clientIP": ""} + req.http.fastly-client-ip + {"""} +
  {"}"};

Here you are using regsub to remove the final } from the incoming JSON and replacing it with a comma to allow the addition of more properties. Because JSON syntax makes use of double-quotes and VCL strings are double-quote delimited, use long string delimiters ({"..."}), which support literal double-quote characters within strings.
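
To see what string this produces, the splice can be modelled in JavaScript. This is only an illustration of the string handling, not code that runs on Fastly: the function name and the edge parameter are invented, and the values that VCL reads from Fastly variables are passed in by hand.

```javascript
// JavaScript model of the VCL string splice above, for illustration only.
// In VCL the "edge" values come from Fastly variables; here they are arguments.
function spliceEdgeFields(body, edge) {
  return body.replace(/\}$/, ", ") +                         // swap the final "}" for a separator
    '"timestamp": "' + edge.timestamp + '", ' +
    '"clientCountry": "' + edge.clientCountry + '", ' +
    '"pop": "' + edge.pop + '", ' +
    '"autoSystemNumber": ' + edge.autoSystemNumber + ', ' +  // a number, so left unquoted
    '"clientIP": "' + edge.clientIP + '"' +
    '}';
}

// Example with placeholder values:
const enriched = spliceEdgeFields('{"loadTime":43.2}', {
  timestamp: "2024-01-01T00:00:00Z",
  clientCountry: "GB",
  pop: "LHR",
  autoSystemNumber: 64496,
  clientIP: "192.0.2.1",
});
console.log(JSON.parse(enriched).pop); // the spliced string parses as one JSON object
```

Note that the technique assumes the incoming body is a JSON object with at least one property. An empty object ({}) would become "{, ..." after the substitution, which is not valid JSON.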

Dispatch the data to your log endpoint

Using the name you chose for your log endpoint earlier (shown here as logger_name), form a log statement to send the data:

sub vcl_error { ... }
Fastly VCL
log "syslog " req.service_id " logger_name :: " req.url.qs;

In place of req.url.qs, substitute req.body if you want to pass through the unmodified POST data, or var.jsonData if you opted to construct a custom JSON object in the previous step.

WARNING: Whichever format you use to serialize the log data, be aware that each log event must be on one line, so cannot contain any newline characters. Log events containing newline characters will be truncated at the end of the first line.

Construct a response to the client

Even though we don't really care about the response, and the browser will ignore it anyway, we should construct a response that's somewhat more appropriate than simply allowing an HTTP 618 beacon:log response to go to the browser, because technically that's invalid HTTP.

sub vcl_error { ... }
Fastly VCL
set obj.status = 204;
set obj.response = "No content";
set obj.http.cache-control = "no-store, private";

The most appropriate HTTP response status code is 204, meaning "No content", and for compatibility with HTTP/1.1, you should also set the response status text (from HTTP/2 on, the response status text is no longer included in the response).

It's also important, in case your client consults a cache before making the beacon request, to ensure that the response is not cacheable. If you are using navigator.sendBeacon, the browser will not consult the cache, but it makes sense to ensure that nothing caches this response in any case.

Deliver the response

The final step within the vcl_error subroutine is to tell Fastly to terminate processing of vcl_error and move to the vcl_deliver stage.

sub vcl_error { ... }
Fastly VCL
return (deliver);

It's a good idea to do this because otherwise subsequent code in the vcl_error subroutine might further modify the response obj.

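Your complete vcl_error subroutine should now look like this (assuming you are working with query strings). The listing also appends a form-encoded POST body, if one is present, to the query string so that data sent in the request body is logged as well: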

sub vcl_error { ... }
Fastly VCL
if (obj.status == 618 && obj.response == "beacon:log") {
  # Fold a form-encoded POST body into the query string
  if (std.strlen(req.body) > 0 && req.method == "POST" && req.http.Content-Type ~ "^application\/x-www-form-urlencoded") {
    set req.url = req.url + if (req.url ~ "\?", "&", "?") req.body;
  }
  # Enrich the data with information known at the edge
  set req.url = querystring.add(req.url, "clientCountry", client.geo.country_code);
  set req.url = querystring.add(req.url, "timestamp", strftime({"%Y-%m-%dT%H:%M:%SZ"}, time.start));
  set req.url = querystring.add(req.url, "pop", server.datacenter);
  set req.url = querystring.add(req.url, "autoSystemNumber", client.as.number);
  set req.url = querystring.add(req.url, "clientIP", client.ip);
  # Dispatch to the log endpoint, then answer the client
  log "syslog " req.service_id " logger_name :: " req.url.qs;
  set obj.status = 204;
  set obj.response = "No content";
  set obj.http.cache-control = "no-store, private";
  return (deliver);
}

Check your log collector

Once you have activated the solution in your service, send some requests to the log path and ensure that you receive a 204 response.
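
One way to script that check is sketched below. The checkBeacon name is invented for this example, the domain in the usage comment is a placeholder for your own, and the fetch implementation is injectable so the function can be exercised without a network connection.

```javascript
// Sends one test beacon and reports whether the response status was 204.
// fetchImpl defaults to the global fetch but can be replaced for testing.
async function checkBeacon(baseUrl, fetchImpl = fetch) {
  const url = new URL("/log?loadTime=43.2", baseUrl);
  const response = await fetchImpl(url.toString(), { method: "POST" });
  return { ok: response.status === 204, status: response.status };
}

// Usage (placeholder domain):
// checkBeacon("https://www.example.com").then((result) => console.log(result));
```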

The data should now be flowing to your chosen third-party logging provider. If it is not, log in to your service in the Fastly management UI to see whether any logging errors are being reported: if we are unable to send data to your log endpoints, the error is shown at the top of the configuration page for your service. Alternatively, load the logging status API endpoint to see the status directly:

$ curl "https://api.fastly.com/service/{SERVICE_ID}/logging_status" -H "Fastly-Key: {FASTLY_API_KEY}"

In particular, note the LastError property for information on why we're unable to send log data to your endpoint. Common reasons for failure of log endpoints to receive data include:

  • Your logging provider is experiencing technical problems
  • The data is not valid and is being rejected by the logging provider's ingestion system
  • Your service didn't emit any log events

Next steps

However you receive the input in this solution, the code above does not validate it. Consider copying known properties individually from the input to an output variable, which lets you check both that each value is correctly formatted and that it falls within allowed bounds. Doing so also discards any excess data that is included in the input but not recognized as a known field name.
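
The allowlist idea is sketched below in JavaScript purely to show the logic; in your service you would express the same checks in VCL. The field names, the validators and the bounds are hypothetical and should be replaced with whatever your beacons actually send.

```javascript
// Allowlist of known fields, each with a validator (hypothetical schema).
const KNOWN_FIELDS = {
  // Non-negative decimal number, capped at ten minutes expressed in milliseconds.
  loadTime: (v) => /^\d+(\.\d+)?$/.test(v) && Number(v) <= 600000,
  // Site-relative path of limited length.
  page: (v) => v.length <= 256 && v.startsWith("/"),
};

// Copies only recognized, valid fields from the input query string to a new one.
function sanitizeBeaconQuery(queryString) {
  const input = new URLSearchParams(queryString);
  const output = new URLSearchParams();
  for (const [name, isValid] of Object.entries(KNOWN_FIELDS)) {
    const value = input.get(name); // first occurrence only; null when absent
    if (value !== null && isValid(value)) {
      output.append(name, value);
    }
  }
  return output.toString();
}

// Example: unknown fields such as "evil" are dropped.
// sanitizeBeaconQuery("loadTime=43.2&page=%2Fhome&evil=1")
```

Building a fresh output rather than deleting bad fields from the input means anything you did not anticipate is excluded by default.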

All code on this page is provided under both the BSD and MIT open source licenses.