What is directory traversal?

Directory traversal, also known as “path traversal” (and identified with CWE-22), is a web application vulnerability that enables attackers to access unintended files on an underlying filesystem. Depending on how and where the traversal occurs, this could enable an attacker to read or write arbitrary files on the web server, possibly enabling the attacker to read sensitive data or files, modify application data, or take full control of the web server.

Traversal vulnerabilities are typically described based on whether they enable reading files or writing files. We’ll demonstrate the impact each can have in the following subsections.

Directory traversal allowing arbitrary file reads

Consider an application that allows a user to store photos and later retrieve them with a GET request using a filename parameter to specify which file to retrieve. If the application does not include protections against path traversal and builds the file path using the provided filename parameter, it could be possible to retrieve arbitrary files from the underlying web server. Figure 1 shows this sequence in practice:

Even if the application is vulnerable to the outlined traversal, there are still limitations like application permissions to consider. For example, if a model of least privilege is applied that limits the application's access on the web server, its permissions may limit the files the attacker can access with the vulnerability. This is one reason traversal payloads frequently use /etc/passwd as their target — the file needs to be readable by all users. There are other ways to sandbox and limit application access, which we’ll discuss further in the prevention section.
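The vulnerable read pattern described above can be sketched in a few lines. This is a minimal illustration, not code from any real application: the `PHOTO_DIR` location and the function names are assumptions, and `posixpath.normpath` only approximates how a filesystem collapses `..` segments (it ignores symlinks).

```python
import posixpath

# Hypothetical storage directory for the photo application.
PHOTO_DIR = "/var/www/app/photos"


def build_photo_path(filename: str) -> str:
    """Vulnerable: joins user input onto the base directory with no checks."""
    return posixpath.join(PHOTO_DIR, filename)


def resolved(path: str) -> str:
    """Collapse '..' segments roughly as the filesystem would (symlinks ignored)."""
    return posixpath.normpath(path)


# A normal request stays inside the photo directory.
print(resolved(build_photo_path("cat.jpg")))                 # /var/www/app/photos/cat.jpg
# A traversal payload climbs out of it.
print(resolved(build_photo_path("../../../../etc/passwd")))  # /etc/passwd
```

Note that `posixpath.join` also discards the base entirely when the second argument is an absolute path, so a filename of `/etc/passwd` escapes without any `../` sequences at all.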

Directory traversal allowing arbitrary file writes

Consider the same photo-storing application, but it now allows the user to name each of the photos they’re storing. When saving the file to disk, the application uses the provided name to build the file path for the photo file. Unless there are sufficient protections, an attacker can add traversal sequences (e.g., ../) to the provided name, controlling which directory the file is stored in.

While this may seem innocuous since it’s just storing a photo, there are ways to further the exploitation. If the file type is not validated (e.g., restricted to JPEG or PNG), an attacker could upload any file type, leading to several different exploitation scenarios such as:

Adding the attacker’s public key to a user’s authorized_keys file (e.g., /root/.ssh/authorized_keys) to gain persistent access
Overwriting application files to modify application behavior
Uploading a web shell within the web root
Causing a denial of service by overwriting needed system files
Uploading executable files (e.g., malware, ransomware)

Depending on the access level of the application, the impact of an arbitrary file write from a directory traversal can be devastating.
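The write scenario can be sketched the same way. The example below is a hedged illustration that runs entirely inside a temporary directory; `save_photo` and the file names are hypothetical, and the only point is that the user-supplied name, not the application, decides where the bytes land.

```python
import os
import tempfile


def save_photo(upload_dir: str, name: str, data: bytes) -> str:
    """Vulnerable: the user-supplied name decides where the bytes land."""
    destination = os.path.join(upload_dir, name)
    with open(destination, "wb") as handle:
        handle.write(data)
    return os.path.realpath(destination)


with tempfile.TemporaryDirectory() as root:
    upload_dir = os.path.join(root, "uploads")
    os.mkdir(upload_dir)
    # The "photo" name contains a traversal sequence and a non-image extension.
    written = save_photo(upload_dir, "../not-a-photo.txt", b"attacker content")
    # True when the file ended up in the parent of uploads/.
    escaped = os.path.dirname(written) == os.path.realpath(root)
    print("written outside uploads/:", escaped)
```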

Real examples of directory traversal

With the basics of directory traversal covered, let’s look at some recent real-world exploitation of this vulnerability.

CVE-2023-2825: Directory traversal in GitLab

GitLab version 16.0.0 contains a directory traversal vulnerability that allows for arbitrary file reads. Uploading a file as an attachment to an Issue in a default GitLab installation causes GitLab to store the file 10 directories deep, in the pattern shown below:

/var/opt/gitlab/gitlab-rails/uploads/@hashed/<directory>/<directory>/<directory>/<directory>/<filename>

After uploading, GitLab also provides an endpoint to retrieve the uploaded file at

/<repo-name>/uploads/<file-id>/<filename>

In a request to that endpoint, GitLab does not sanitize or validate the filename parameter, which permits a directory traversal attack. To exploit the traversal, the repository must be nested within at least 5 groups, with the number of groups directly correlating to the number of directories you can traverse using the vulnerability.

In a standard installation, this means you need to nest the repository in 11 groups to be able to reach the root of the filesystem. In this scenario, an attack payload to retrieve the /etc/passwd file could look like the following:

GET /Group-1/Group-2/Group-3/Group-4/Group-5/Group-6/Group-7/Group-8/Group-9/Group-10/Group-11/<repo-name>/uploads/<file-id>/..%2f..%2f..%2f..%2f..%2f..%2f..%2f..%2f..%2f..%2f..%2f..%2fetc%2fpasswd
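To see why the encoded payload works once it is decoded, the sketch below percent-decodes a `..%2f` sequence and resolves it against a stand-in for the 10-directory upload location shown above. This is not GitLab's code and it does not model the group-nesting requirement; `UPLOAD_DIR`, the `a/b/c/d` directory names, and `resolve_encoded` are assumptions made purely for illustration.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical stand-in for the 10-directory upload location; the
# a/b/c/d names replace the <directory> placeholders.
UPLOAD_DIR = "/var/opt/gitlab/gitlab-rails/uploads/@hashed/a/b/c/d"


def resolve_encoded(filename_param: str, base: str = UPLOAD_DIR) -> str:
    """Percent-decode the parameter, then resolve it against the base."""
    decoded = unquote(filename_param)  # "..%2f" becomes "../"
    return posixpath.normpath(posixpath.join(base, decoded))


# Ten encoded "../" sequences climb all ten directories.
payload = "..%2f" * 10 + "etc%2fpasswd"
print(unquote(payload))
print(resolve_encoded(payload))  # /etc/passwd
```

With only nine sequences the same call resolves to `/var/etc/passwd`, which is why the number of traversable directories matters for this exploit.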

Unauthenticated attackers can only exploit the vulnerability if a public repository already satisfies the nested-groups requirement. Because this is unlikely, exploitation is more plausible from an authenticated user with privileges to create the nested groups and repository themselves. A full exploit chain can first create the needed groups and repository, upload a file, and then exploit the traversal to read arbitrary files, as demonstrated in a public proof of concept (PoC).

CVE-2022-48362: Directory traversal in ManageEngine Desktop Central

In ManageEngine Desktop Central builds prior to 10.1.2127.1, a directory traversal in file upload functionality allowed for arbitrary file writes by manipulating the “computerName” parameter (or a few others) to include traversal sequences.

At the time of its discovery, this vulnerability was actively exploited and combined with an authentication bypass (CVE-2021-44515) to enable remote code execution. For brevity, we’ll focus on the path traversal portion of the exploitation, as it enables a file write and bypasses loose validation.

In a function named “doPost”, Desktop Central handles several parameters as part of a file upload, including “computerName” and “filename”, among others. From this function, two vulnerabilities arise which lead to a successful directory traversal that enables file writes:

Only the filename parameter is checked for traversal sequences. Other parameters such as “domainName”, “computerName” or “customerId” are used to build the absolute path for the file, but are not checked for traversal sequences. The “computerName” is the last parameter used in the concatenated string, so it is the ideal place to enter a traversal sequence because no additional content is appended to it.
The file upload permits files with extensions of zip, 7z, and gz. At first glance, this may seem safe. However, because Desktop Central is a Java application, and JAR files are built on the zip format, this enables remote code execution. By uploading a zip file to the C:\Program Files\DesktopCentral_Server\lib directory and forcing a restart, a savvy attacker can overwrite class files in the application to include their own code and gain code execution.

This vulnerability highlights the severe impact a traversal vulnerability can enable, while also showing how an incomplete prevention can unintentionally permit directory traversal.
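The shape of that incomplete check can be sketched as follows. This is not Desktop Central's code (which is Java); the `BASE_DIR` value, the function name, and the order in which parameters are joined are assumptions, chosen only to show how validating one parameter while trusting the others still permits traversal. `ntpath` is used so the Windows-style paths resolve the same way on any platform.

```python
import ntpath

# Hypothetical base directory; not the product's real layout.
BASE_DIR = r"C:\uploads"


def build_upload_path(customer_id: str, domain_name: str,
                      computer_name: str, filename: str) -> str:
    # Only the filename is checked for traversal sequences...
    if ".." in filename or "/" in filename or "\\" in filename:
        raise ValueError("traversal sequence in filename")
    # ...but the other parameters flow into the path unchecked.
    directory = ntpath.join(BASE_DIR, customer_id, domain_name, computer_name)
    return ntpath.normpath(ntpath.join(directory, filename))


# The filename is clean, yet computer_name moves the file out of BASE_DIR.
print(build_upload_path("1", "corp", r"..\..\..\lib", "payload.zip"))
# C:\lib\payload.zip
```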

Directory traversal as seen by WAFs

Directory traversal is one of the most commonly observed attack techniques, as highlighted in our Network Effect Threat Report from the second quarter of 2023.

This could be for several reasons including the severity of the impact if successful or the size of payload lists that attackers and scanners may use. For example, a snippet from one of the PayloadsAllTheThings directory traversal lists attempts to read the same file with varying depths of traversal, as shown below:

../../../../../../../../../etc/passwd
../../../../../../../../etc/passwd
../../../../../../../etc/passwd
../../../../../../etc/passwd
../../../../../etc/passwd
../../../../etc/passwd
../../../etc/passwd

In most cases, an attacker likely does not know where the application is located on the filesystem when testing for a traversal vulnerability. However, path resolution cannot go above the filesystem root (i.e., /), so any extra “../” sequences are simply ignored once the root is reached. Attackers therefore use longer sequences of “../” to make sure the root is reached, while also including shorter sequences in case longer sequences are blocked or break application functionality.
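The effect of overshooting the root can be checked with a short sketch. The application directory below is hypothetical, and `posixpath.normpath` stands in for the operating system's own path resolution.

```python
import posixpath


def resolve_from(app_dir: str, depth: int, target: str = "etc/passwd") -> str:
    """Resolve a payload of `depth` '../' sequences against app_dir."""
    payload = "../" * depth + target
    return posixpath.normpath(posixpath.join(app_dir, payload))


# Hypothetical application directory, three levels below the root.
APP_DIR = "/var/www/app"
for depth in (2, 3, 9):
    print(depth, resolve_from(APP_DIR, depth))
# 2 /var/etc/passwd
# 3 /etc/passwd
# 9 /etc/passwd
```

A payload that is too short misses the target, while any payload at or beyond the required depth resolves to the same file.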

Other attack types, such as Cross-Site Scripting (XSS), can use payload polyglots that combine multiple techniques into a single payload. Traversal payloads are harder to combine this way, since each one targets a single path at a single depth, which leads attackers and scanners to send many more payloads to test for traversal than when testing for XSS. Our example file from PayloadsAllTheThings contains 140 payloads, compared to their XSS_Polyglots file which has 16. The largest payload file for traversal in PayloadsAllTheThings has over 21,000 entries, compared to the largest for XSS at over 600, SQLi at over 400 (though one should assume this would be larger in practice if the database type is unknown, since multiple types would need to be tested), and over 400 for OS command injection.

While the size of traversal payload lists offers one hypothesis for why the traversal attack technique is so widely observed, the severity of the impact may also cause the technique to be used more frequently, as demonstrated in the above discussion of CVE-2022-48362, which enabled remote code execution and was known to be exploited at the time of discovery.

Preventing directory traversal vulnerabilities

There are several strategies that can be used to prevent traversal vulnerabilities, including design changes to prevent building file paths with user input, strict validation of input, using path canonicalization, and limiting application access. We’ll cover each of these below.

Prevent building file paths with user input

Rather than using user input to build the file path, consider using a corresponding file-id or name to reference the file. Then, map file-ids to their corresponding storage path. When a user requests this file or uploads a file, they can only control its id, preventing the user from ever accessing the contents used to build the file path. This removes the possibility of traversal entirely, as the user’s input is no longer used to build the file path of the retrieved file.
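One way to picture this design is a server-side index from generated ids to storage paths. The sketch below is only an illustration under stated assumptions: the in-memory dictionary, the `UPLOAD_ROOT` value, and the function names are all hypothetical, and a real application would more likely keep the mapping in a database.

```python
import secrets
from typing import Dict

# Hypothetical in-memory index; a real application would likely use a database.
_storage_index: Dict[str, str] = {}

UPLOAD_ROOT = "/srv/app/uploads"  # assumed base directory


def register_upload() -> str:
    """Create a server-generated id and storage path for a new upload."""
    file_id = secrets.token_hex(16)
    _storage_index[file_id] = f"{UPLOAD_ROOT}/{file_id}.bin"
    return file_id


def path_for(file_id: str) -> str:
    """Look up the stored path; unknown ids are rejected, never interpreted."""
    try:
        return _storage_index[file_id]
    except KeyError:
        raise LookupError("unknown file id") from None
```

Because the id is only ever used as a dictionary key, a value such as `../../etc/passwd` simply fails the lookup instead of being joined onto a path.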

Strict validation of user input

Strict validation is different from sanitization. Rather than sanitizing input by attempting to remove traversal sequences (e.g., ../), which can be bypassed in innumerable ways, validate that only expected content is in user input and reject anything that does not pass the strict validation. Examples of validation that can help prevent directory traversal:

Validating that a filename contains only alphanumeric characters
Validating that only a single “.” character is in the provided filename
Validating that unwanted characters are not included in the provided input (e.g., /, \)
Validating the file type of an uploaded file (not by extension, which can be easily bypassed)
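The checks listed above might look like the following sketch. The length limits, the helper names, and the choice of JPEG and PNG are assumptions for illustration. The signature comparison is only a minimal content check based on each format's leading bytes, not a complete verification that the file is a well-formed image.

```python
import re
from typing import Optional

# Allowlist: alphanumeric base name, exactly one '.', alphanumeric extension.
# The length limits are arbitrary choices for this sketch.
_FILENAME_RE = re.compile(r"[A-Za-z0-9]{1,64}\.[A-Za-z0-9]{1,8}")

# Leading bytes ("magic numbers") of the two example image formats.
_SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def is_valid_filename(name: str) -> bool:
    """Accept only names that match the allowlist in full; reject the rest."""
    return _FILENAME_RE.fullmatch(name) is not None


def detect_image_type(data: bytes) -> Optional[str]:
    """Identify the file type from its content rather than its extension."""
    for kind, signature in _SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return None
```

Input that fails either check is rejected outright rather than cleaned up, which is the difference between validation and sanitization.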

Use path canonicalization language features for strict validation

Path canonicalization resolves a file path to its actual absolute path, eliminating symlinks, ../ sequences, and other symbolic content. After getting a canonical path, you can verify it still starts with the canonical form of your expected base directory (e.g., uploads/photos/). Some examples of functions to get a canonical path are:

Java: getCanonicalPath
PHP: realpath
C: realpath
ASP.NET: GetFullPath
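The same canonicalize-then-verify approach can be sketched in Python, which is not one of the languages listed above; `os.path.realpath` plays the role of the canonicalization function here. The function name and error message are hypothetical.

```python
import os


def safe_resolve(base_dir: str, user_path: str) -> str:
    """Return the canonical path if it stays inside base_dir, else raise."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so "/uploads/photos-evil"
    # is not mistaken for a child of "/uploads/photos".
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate
```

Comparing whole components matters: a plain string prefix check without a trailing separator would accept a sibling directory whose name merely begins with the base directory's name.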

Limit application access

In general, an application should only have access to the files and directories it needs in order to function properly. This does not prevent directory traversal, but it limits the impact of the vulnerability if one is discovered. For example, a web application should never run as the root user, and should ideally be restricted to accessing only the files it needs to serve the application.

Summary

Directory traversal, also known as “path traversal,” is a web application vulnerability that enables attackers to access unintended files on an underlying filesystem. Depending on the traversal vulnerability, this could enable an attacker to read sensitive data or files, modify application data, or take full control of the web server. Traversal vulnerabilities can have severe impacts and, as seen in our real-world examples and Next-Gen WAF data, are a prominent attack vector. However, applications can prevent path traversal vulnerabilities by following the solutions we’ve outlined. If you’re having difficulty preventing directory traversal, or utilize a product that has suffered from these vulnerabilities in the past, check out Fastly’s Next-Gen WAF to protect against these attacks and more.