---
title: Regular expressions in Fastly VCL
summary: null
url: https://www.fastly.com/documentation/reference/vcl/regex
---


Fastly VCL supports regular expressions as an operand to the `~` comparison [operator](/reference/vcl/operators) and also as parameters to the following functions:

* `regsub`
* `regsuball`
* `querystring.regfilter`
* `querystring.regfilter_except`

We support [PCRE](https://www.pcre.org/current/doc/html/) expressions, with some minor exceptions, using the PCRE2 library. Expressions are evaluated at compile time, so they cannot be built dynamically or converted from any other type. Where an operator or function expects a regex, **you must provide a literal pattern in your code**. See [regular-expressions.info](https://www.regular-expressions.info/quickstart.html) for a good introduction to pattern matching using regex.
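
As a minimal sketch of the literal-pattern rule, the fragment below contrasts a pattern written directly in the code with one held in a variable. The header name `X-Is-Api` and the variable `var.pattern` are hypothetical, and the second form is left commented out because it is expected to be rejected at compile time:

```vcl
// Accepted: the pattern is a string literal known at compile time.
if (req.url.path ~ "^/api/v[0-9]+/") {
  set req.http.X-Is-Api = "1";
}

// Expected to fail compilation: the pattern comes from a variable.
// declare local var.pattern STRING;
// set var.pattern = "^/api/";
// if (req.url.path ~ var.pattern) { ... }
```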

> **HINT:** You can use the freely available [Regex101](https://regex101.com/) tool to test out your patterns, but be aware of potential minor syntax differences with Fastly's engine. [Fastly Fiddle](https://fiddle.fastly.dev) can be used to test expressions on the actual Fastly platform.

## Example usage

Regular expressions are a very common way to match path prefixes or segments in VCL, along with many other use cases:


| Example usage|Description|
|-------------|-----------|
| `var.my_str ~ "foo"` | Variable contains "foo" |
| `req.url.path ~ "^/admin(/.*)?\z"` | URL path starts with `/admin` segment |
| <code>req.url.ext ~ "^(jpe?g&#124;png&#124;gif)\z"</code> | File extension match |
| `req.url.path ~ "^/bins/([0-9a-f]+)"` | Path slug with hex encoding |
| `req.url.path ~ "^/([^/]+)/foo"` | First path segment (one or more characters other than `/`) followed by `/foo` |
| `req.http.host ~ "^www\."` | Hostname starting with `www.` |
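
For illustration, several of the patterns in the table could be combined in a fragment like the one below, intended for `vcl_recv`. The `X-Section` header is an assumption made for this sketch and is not a Fastly convention:

```vcl
// Classify the request using patterns from the table above.
// X-Section is a hypothetical header used only for this example.
if (req.url.path ~ "^/admin(/.*)?\z") {
  set req.http.X-Section = "admin";
} else if (req.url.ext ~ "^(jpe?g|png|gif)\z") {
  set req.http.X-Section = "image";
} else if (req.http.host ~ "^www\.") {
  set req.http.X-Section = "www";
}
```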


## Capture groups and replacement

Every time a regular expression is evaluated as part of a conditional expression involving the `~` operator, the `re.group.{N}` variables will be populated with the matched text and any capturing subgroups in the order that they are matched:

```vcl
if (req.url.path ~ "/products/(uk|us|au|jp)/(\d+)") {
  set req.http.product-region = re.group.1;
  set req.http.product-id = re.group.2;
}
```

If the regular expression contains named captures, they may be accessed using `re.group.{NAME}` with an identifier identical to the one given in the pattern. This can be beneficial when there may be future changes to a complex pattern that could affect numbering.

```vcl
if (req.url.path ~ "/products/(?<region>uk|us|au|jp)/(?<product>\d+)") {
  set req.http.product-region = re.group.region;
  set req.http.product-id = re.group.product;
}
```

The `regsub` and `regsuball` functions take a *replacement* parameter whose value is used to replace pattern matches in the source data. These replacement values may include references to the capture groups in the pattern using a backslash followed by the group number (for example, `\1`):

```vcl
// /12345-blue-unisex-stripe-t-shirt => /products/12345
set req.url = regsub(req.url, "^/(\d+)-[\w-]+\z", "/products/\1");
```

Regular expressions used as function parameters, as in `regsub`, don't populate or affect the value of the `re.group.{N}` capture variables.
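
A short sketch of this behavior, assuming a hypothetical rewrite from `/products/` to `/items/`: the `regsub` call runs its own pattern, yet `re.group.1` still holds the value captured by the earlier `~` comparison.

```vcl
if (req.url.path ~ "^/products/(\d+)") {
  // regsub uses its own pattern and leaves re.group.* untouched
  set req.url = regsub(req.url, "^/products/", "/items/");
  // re.group.1 still refers to the capture from the ~ match above
  set req.http.product-id = re.group.1;
}
```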

Use `(?: ... )` to prevent the grouping meta-characters `()` from capturing into a `re.group.{N}` variable. This usage is preferred whenever possible, for efficiency reasons.
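
For example, in the following sketch (the path names are illustrative) the alternation is wrapped in a non-capturing group, so the numeric ID is the first capture rather than the second:

```vcl
// (?:products|items) groups the alternatives without capturing,
// so the digits are available as re.group.1
if (req.url.path ~ "^/(?:products|items)/(\d+)") {
  set req.http.product-id = re.group.1;
}
```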

## Pattern modifiers

Fastly VCL doesn't provide a way to set regex modifiers outside of the pattern, but they can be prefixed to the pattern using the `(?_)` syntax. We support standard PCRE2 modifiers. The most common ones used in Fastly customer code are:

* `(?i)`: **ignore case**. Makes the pattern case insensitive.
* `(?s)`: **dot all**. Allows the `.` to match any character, *including newlines*.
* `(?m)`: **multi-line**. Makes `^` and `$` match at the beginning and end of lines (`\z` continues to match only at the end of the string).
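
As a small illustration of an inline modifier, the sketch below prefixes `(?i)` to a pattern so that it matches regardless of letter case. The header name `X-Is-Bot` and the matched token are assumptions chosen for the example:

```vcl
// (?i) applies to the whole pattern, so "Googlebot", "googlebot"
// and "GOOGLEBOT" would all match
if (req.http.User-Agent ~ "(?i)googlebot") {
  set req.http.X-Is-Bot = "1";
}
```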

## Text encoding and multi-byte characters

VCL source code, including any contained regular expressions, is interpreted as UTF-8, which means that one character of text can be a variable number of bytes. It is possible to match multi-byte characters using regex, but the regex parser will see only sequences of bytes, not characters or [code points](https://en.wikipedia.org/wiki/Code_point). The following patterns will all match a 👋 (waving hand emoji), which is represented by a 4-byte sequence, at the start of a string:

* `^👋`
* `^....`
* `^\xF0\x9F\x91\x8B`

Notice that a single `.` will *not* match a multi-byte character (because `.` matches *one byte*, not one character). In addition, multi-byte characters in URL paths (along with anything else that is not [RFC3986](https://tools.ietf.org/html/rfc3986#section-3.3) compliant) are automatically URL-encoded. So, when matching on `req.url`, use the encoded form or pass the URL through `urldecode` first. See the following examples:

```vcl
if (req.url ~ {"^/foo/👋"}) { ... }                           // No match
if (req.url ~ {"^/foo/%F0%9F%91%8B"}) { ... }                // Matches
if (urldecode(req.url) ~ {"^/foo/\xF0\x9F\x91\x8B"}) { ... } // Matches
if (urldecode(req.url) ~ {"^/foo/👋"}) { ... }                // Matches
if (urldecode(req.url) ~ "^/foo/%F0%9F%91%8B") { ... }       // Matches
```

Complicating matters, regular expressions in VCL are written using [STRING](/reference/vcl/types/string) syntax, which means URL-escape notation (e.g., `"%20"`) is decoded by the string type before the pattern reaches the regex engine. This is why the final example above matches: `urldecode` decodes the emoji in `req.url` on the left side, and the STRING type decodes the URL escapes in the pattern on the right side.

As a result, we recommend that any regular expression that includes URL-escape notation should be expressed as a *long string* (e.g., `{"%20"}`). The long string notation does not decode URL escape notation, so it will be passed to the regex engine unmodified.
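
To make the difference concrete, here is a sketch that assumes a request for the hypothetical path `/my%20files/report.pdf`. The `X-Match` header is used only to show which branch would run:

```vcl
// Long string: the regex engine receives the characters "%20" as written,
// so the pattern matches the encoded URL.
if (req.url ~ {"^/my%20files/"}) {
  set req.http.X-Match = "long-string";
}

// Double-quoted string: "%20" is decoded to a space before the pattern is
// compiled, so it looks for "/my files/" and does not match the encoded URL.
if (req.url ~ "^/my%20files/") {
  set req.http.X-Match = "double-quoted";
}
```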

Since the regex engine has no concept of a multi-byte character, we do not support the `\uXXXX` notation for Unicode escapes.

## Best practices and common mistakes

Here is some of our most common advice to customers who are writing regular expressions in VCL:

* **Anchor the pattern**: Often you will want to find a match at the beginning or end of a URL path or hostname. Don't forget to include `^` at the beginning or `\z` at the end, otherwise you may find a match anywhere in the string.
  > ✅ `^web\d+\.example\.com\z`&emsp;&emsp;&emsp;❌ `web\d+\.example\.com`

* **Prefer `\z` over `$`**: `\z` matches only at the very end of the string. `$` also matches just before a trailing newline at the end of the string, so if you use it in combination with capturing groups, you may not be capturing what you expect. `\z` is also more efficient, so prefer it wherever `\n` cannot appear.
  > ✅ `req.url ~ "/foo\z"`&emsp;&emsp;&emsp;❌ `req.url ~ "/foo$"`

* **Escape dots**: The `.` pattern matches any character, so remember to escape it if you want to match a dot:
  > ✅ `example\.com`&emsp;&emsp;&emsp;❌ `example.com`

* **Don't escape slashes**: In some languages regular expressions are bounded by a delimiter character, commonly a slash (e.g., `/abc/`). This isn't the case in VCL and there's therefore no need to escape forward slashes:
  > ✅ `/foo/bar`&emsp;&emsp;&emsp;❌ `\/foo\/bar`

* **Don't use `regsub` for extraction**: To extract a substring into a variable, use the `if` function. If you use `regsub` and there is no match, the full source string is assigned to the target variable, which probably isn't what you want.
  > ✅ `set var.lang = if(req.url ~ "^/(\w{2})/", re.group.1, "en");`<br/>
  > ❌ `set var.lang = regsub(req.url, "^/(\w{2})/.*\z", "\1");`

* **Use *long strings* to avoid double encoding**: [Strings](/reference/vcl/types/string) expressed in VCL using double quotes (e.g., `"foo"`) automatically decode URL-escape sequences, such as `%20` (which is a space). To ensure characters are processed by the regular expression parser and not by the string parser, use a *long string*. For example:
  > ✅ `req.url ~ {"/%2ehidden"}`&emsp;&emsp;&emsp;❌ `req.url ~ "/%2ehidden"`

* **RFC3986-non-compliant URLs get *URL-encoded***: If Fastly receives a URL path containing characters not allowed in [RFC3986](https://tools.ietf.org/html/rfc3986#section-3.3), we will URL-encode them, which means a regex that attempts to match the original form will fail. Use a case-insensitive regex in a long string to match the URL-encoded version:
  > ✅ `req.url ~ {"(?i)^/foo/%3C%%20\w+%20%%3E"}`<br/>
  > ❌ `req.url ~ "^/foo/<% \w+ %>"`

* **Don't use regular expressions to match query parameters**: It's easy to make a mistake when trying to match or filter a query string parameter with a regular expression, but VCL has a whole set of [query string-related functions](/reference/vcl/functions/query-string) to help with these use cases.
  > ✅ `set req.url = querystring.filter(req.url, "foo");`<br />
  > ❌ `set req.url = regsub(req.url, "([?&])foo=[^&]*&?", "\1");`

* **Use non-capturing groups when possible**: These are more efficient. You can make a group non-capturing by prefixing it with `?:`:
  > ✅ `if (beresp.http.Cache-Control ~ "(?:private|no-store)") {`<br/>
  > ❌ `if (beresp.http.Cache-Control ~ "(private|no-store)") {`
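
Several of these recommendations can be combined in one place. The following is a minimal sketch of a fragment intended for `vcl_recv`; the hostname pattern, the `foo` parameter, and the `X-Lang` and `X-Internal-Host` headers are illustrative assumptions rather than required names:

```vcl
declare local var.lang STRING;

// Extraction with if() and a fallback value, instead of regsub
set var.lang = if(req.url.path ~ "^/(\w{2})/", re.group.1, "en");
set req.http.X-Lang = var.lang;

// Query string handling via a dedicated function, not a regex
set req.url = querystring.filter(req.url, "foo");

// Anchored at both ends, dots escaped, \z rather than $
if (req.http.host ~ "^web\d+\.example\.com\z") {
  set req.http.X-Internal-Host = "1";
}

// Long string so the regex engine sees the URL escape unmodified
if (req.url ~ {"/%2ehidden"}) {
  error 403 "Forbidden";
}
```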
