Local File Inclusion – Tricks of the Trade

By: Cris Neckar, Andrew Case

Everyone understands that local file includes are bad. The ability to execute an arbitrary file as code is unquestionably a security risk and should be protected against. However, the process of exploitation can be rather involved and is commonly misunderstood. In this post I want to clarify the risks involved in this type of vulnerability and the complications involved in exploitation.

To start lets give a bit of background. I will focus on PHP on Linux specifically but this class of vulnerability may also exist in many other interpreted languages on different platforms. Generically, a file inclusion vulnerability is the dynamic execution of interpreted code loaded from a file. This file could be loaded remotely from an http/ftp server in the case of remote inclusions, or as I will cover, locally from disk. Generally remote file inclusion vulnerabilities are trivial to exploit so I will not be covering them. The following line is an example of a local file inclusion vulnerability in PHP:

require_once($LANG_PATH . ‘/’ . $_GET[‘lang’] . ‘.php’);

In this case an attacker controls the “lang” variable and can thereby force the application to execute an arbitrary file as code. The attacker does not however control the beginning of the require_once() argument, so including a remote file would not be possible. To exploit this an attacker would set the ‘lang’ variable to a value similar to the following:

lang=../../../../../../file/for/inclusion%00

Before we get into discussing the exploitation of this type of vulnerability let me say a few words about preventing them. In the preceding case, the vulnerability could be trivially mitigated through input validation. A simple check for non-alphanumeric characters would suffice in this case. However, where possible I would recommend completely avoiding user input for this type of logic and instead selecting the proper include from a hardcoded list of known good files based on a user supplied index number or hash.

Now that we know how to avoid these when developing applications lets get back to methods of exploitation. A straight forward vulnerability such as this one can in fact be quite difficult to reliably exploit given the differences in deployment platforms. When developing an exploit the first question to ask yourself is generally “what do I need for successful exploitation?”. In this case, the answer to that question is a file stored locally on the target system which contains PHP code that accomplishes our goal. In the best case we will be able to include a file which we directly control the contents of.

This can be an interesting puzzle as it is almost a case of chicken before the egg. To gain access to the remote system we need the ability to create a file on the remote system. The first possibility, and by far the simplest, is to look at the features provided by the application we are attacking. For example, many local inclusion exploits use features such as custom avatars and file storage mechanisms to place code on the target system. Bypassing various checks performed on these types of files/images can be an interesting puzzle in itself and the details are best saved for a future post. However, we want to talk about these vulnerabilities on systems which do not allow such trivial exploitation.

If the target application does not provide some way of uploading or changing a file on disk we need to examine other options. I suggest examining all access to the target server. Ask yourself “What services are available to me and what files to they access? Do I control any of the data written to these files?”. An anonymous FTP server or similar would certainly make life easier here but that would be to good to be true. :)

Generally when people discuss local includes the assumption is that the target file will be the HTTP servers logs. It is quite easy to influence the contents of log files as their purpose is to store the requests that you, the user, make. Most people will suggest the logs and conveniently glaze over the complexities that their usage presents. There are several major potential hurdles in the use of logs.

First we run into the problem of finding the logs. In a production environment it is rare to use default paths for log data, ‘/var/log/httpd/access_log’ is simple enough to guess but what do you do when the log is stored in ‘/vol001/appname.ourwidget.com/log’. Guessing this path would be non-trivial at best and even assuming that verbose error messages or a similar information disclosure gives you some hint to directory structure, using these methods in a reliable exploit would be extremely difficult.

To jump this hurdle lets examine an interesting feature of the Linux proc file-system, namely, the ‘self’ link. The Linux kernel exports a good bit of interesting information about individual processes to usermode through the proc entries for each process id. It also creates an entry called ‘self’ which provides a process easy access to its own process information. When we access /proc/self through the context of a PHP include the link will, in most cases, point to the process entry for the httpd child which has instantiated to PHP interpreter library (This may not be the case if the interpreter has been called as CGI). Often when an HTTPd is run, the path to its configuration file is passed as an argument. If this is the case, finding the log file is a simple procedure of including ‘/proc/self/cmdline’, reading the location of the configuration file and including it to find the path to log files.

Viewing the apache cmdline proc entry

Viewing the apache cmdline proc entry

If the ‘cmdline’ entry does not contain the configuration file there is another option. The per process proc entry also contains a directory entry called ‘fd’. This directory exports a numbered entry for each file descriptor that the process currently holds. In writing PoC for this post on a 2.6.20 kernel we noticed that at some point a kernel developer had the foresight to set the permissions on these entries so that they could be read only by the instantiating user. We tasked intern Y with finding the change and, after grepping the diffs of every kernel release (ever… i.e. wget -r –no-parent http://www.kernel.org/pub/linux/kernel/) he found the following. In May of 2007 the following patch was entered to address the case where a process opens a file and then drops its permissions (Interesting, that is exactly what Apache does).

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=8948e11f450e6189a79e47d6051c3d5a0b98e3f3

This change was committed around the release of version 2.6.22. Using this new access we are able to directly access the files opened by the apache process through these proc entries. By iterating the file name from ‘0’ we are able to directly access the HTTPd log file which will undoubtedly be open for use by the web server. Simply include the files ‘/proc/self/fd/<fd number>’ until you hit on the right file descriptor.

Using the fd proc entries to access apache logs

Using the fd proc entries to access apache logs

Great, so we found the log file. Now we need to determine what fields we actually control. It is commonly believed that you can arbitrarily enter text into apache logs by putting code into GET variables or the requested path. This is generally not the case as these values will almost always be URL encoded and will therefore not execute correctly. I find the simplest field to use is actually the user name. This is pulled from the Authorization header which is typically base64 encoded and thereby avoids the URL encoding mechanisms. By base64 encoding a string similar to the following:

<?php passthru($_GET[‘cmd’]); ?>:doesn’t matter

And specifying an Authorization header as follows:

Authorization: Basic PD9waHAgcGFzc3RocnUoJF9HRVRbJ2NtZCddKTsgPz46ZG9lc24ndCBtYXR0ZXI=

We can insert arbitrary code into the HTTP logs.

PHP code as basic auth username

PHP code as basic auth username

Assuming the stars align and this method is successful there is still one more caveat. On production HTTP servers the logs files tend to be rather large, often in the range of 200mb. Your PHP code is going to be at the very end of the current log file which will likely mean that for each command you run it will take a very long time (depending on connection speed) to wait for the page to display the output from your command. It is possible to use the error_log as this is likely somewhat smaller, but these can still be rather larger than we would hope.

We now have somewhat reliable ways to get code execution, but I hear you saying “There must be a better way”. I am going to present a method which is specific to PHP and somewhat specific to the target application. This is a very common scenario but if it does not fit your needs I hope that it will at least help you to get into the right mindset for this type of exploit development.

PHP provides a mechanism for maintaining session state across requests. Many other languages provide similar interfaces but their internals are sometimes quite different. In the case of PHP a session is started using, logically enough, session_start(). This code can be found is ‘/ext/session/session.c’ within the PHP codebase. Briefly, it checks whether a session cookie was sent with the current request, if not it creates a random session id and sets the cookie within the response to the user. It also registers the global $_SESSION variable which is directly tied to a file stored in the temporary directory. This file will contain any variables set under the session context formatted as a PHP array.

In the case of an application which tracks session information, any setting stored as a string which the user controls could provide an excellent target for a local file include vulnerability. The session file will be named ‘sess_<session_id>’. Although the session_id is a random hash it is trivial to retrieve as it is stored locally in a cookie.

Viewing the PHP session cookie

Viewing the PHP session cookie

In our simple example, our session has a few variables but the simplest to control arbitrarily is a field called ‘signature’. PHP applications tend to store all sorts of interesting things in session variables and there is often a string that you control arbitrarily. In this case by setting our ‘signature’ to PHP code we can gain command execution through this session file.

Putting it all together

Putting it all together

Although the specific methods I have outlined here will not always work in your particular situation I hope that I have at least prompted some interest in the many possibilities for exploiting this type of vulnerability. Regardless of how limited you feel by the platform you are exploiting there is almost always some trick that you can use to get the better of the system.