NeoPI in the Wild

December 20, 2011

By Scott Behrens

NeoPI was a project developed by myself and Ben Hagen to aid in the detection of obfuscated and encrypted webshells.  I recently came across an article about Webacoo shell and a rewrite of this php backdoor to avoid detection from NeoPI.

Webacoo’s developer used a few interesting techniques to avoid detection.  The first technique was to avoid signature based detection by using the function ‘strrev’:

$b=strrev("edoced_4"."6esab")

This bypasses our traditional based signature detection and also lends to a few other techniques to bypass signature based detection.  Another webshell surfaced after NeoPI’s release that uses similar techniques to avoid signature based detection.  Another example could be the following (as seen in https://elrincondeseth.Wordpress.com/2011/08/17/bypasseando-neopi/):

$b = 'bas'.'e64'.'_de'.'code';

We can see that just by breaking up the word, the risk of detection is highly mitigated.  As I suggested at B-Sides, signature based detection is complimentary to the tests in NeoPI but by itself, ineffective.   These methods described above completely thwart this tests effectiveness.

But one thing these techniques must do at some point is actually eval the code.  Webacoo for example uses the following:

eval($b(str_replace(" ","","a W Y o a X …SNIP

By developing a regex that looks for eval and a variable holding the global function, we can flag this file as malicious.  After running this test against a WordPress server with Webacoo’s shell, I observed the following:

 

Figure 1 – Webacoo identified as suspicious in eval test

NeoPI was able to detect the function and flagged it as malicious.  This particular type of use of eval isn’t very common and I have really only seen it used in malware.  That being said functions.php was also flagged so I imagine this test can still have many false positives and should be used to help aid in manual investigation of the files identified.

Another tweak Webacoo’s developer did was insert spaces in-between each character of a base64 encoded string.  The function str_replace() is called to replace each space before the code is base64_decoded  and eval’d.

In order to thwart this particular obfuscation technique, I went ahead and modified the entropy function to strip spaces within the data the function is analyzing. The screenshot below shows a scan against 1752 php files in WordPress and shows the entropy test results as flagging webacoo as potentially malicious. This increased NeoPI’s effectiveness at detecting webacoo.php but is more of a stopgap solution as the attacker can craft other junk characters to lower the shells entropy and index of coincidence. Some additional thought and research is needed on potentially looking for these complicated search and replace functions to determine if the data being obfuscated is malicious.

 

Figure 2 – Test results after modifying entropy code results in higher entropy of webacoo.php

The latest version of the code can be checked out at http://github.com/Neohapsis/NeoPI which includes these enhancements.

As for improving the other tests effectiveness, I am looking into the possibility of identifying base64 encodings without looking for the function name.  This technique may be helpful by building a ratio of upper and lower case characters and seeing if there is a trend with files that use obfuscation.

If anyone has interesting techniques for defeating NeoPI please respond to this post and I’ll look at adding more detection mechanisms to this tool!


“Researchers steal iPhone passwords in 6 minutes”…true…but not the whole story

February 28, 2011

By Patrick Toomey

Direct link to keychaindumper (for those that want to skip the article and get straight to the code)

So, a few weeks ago a wave of articles hit the usual sites about research that came out of the Fraunhofer Institute (yes, the MP3 folks) regrading some issues found in Apple’s Keychain service.  The vast majority of the articles, while factually accurate, didn’t quite present the full details of what the researchers found.  What the researchers actually found was more nuanced than what was reported.  But, before we get to what they actually found, let’s bring everyone up to speed on Apple’s keychain service.

Apple’s keychain service is a library/API provided by Apple that developers can use to store sensitive information on an iOS device “securely” (a similar service is provided in Mac OS X).  The idea is that instead of storing sensitive information in plaintext configuration files, developers can leverage the keychain service to have the operating system store sensitive information securely on their behalf.  We’ll get into what is meant by “securely” in a minute, but at a high level the keychain encrypts (using a unique per-device key that cannot be exported off of the device) data stored in the keychain database and attempts to restrict which applications can access the stored data.  Each application on an iOS device has a unique “application-identifier” that is cryptographically signed into the application before being submitted to the Apple App store.  The keychain service restricts which data an application can access based on this identifier.  By default, applications can only access data associated with their own application-identifier.  Apple realized this was a bit restrictive, so they also created another mechanism that can be used to share data between applications by using “keychain-access-groups”.  As an example, a developer could release two distinct applications (each with their own application-identifier) and assign each of them a shared access group.  When writing/reading data to the keychain a developer can specify which access group to use.  By default, when no access group is specified, the application will use the unique application-identifier as the access group (thus limiting access to the application itself).  Ok, so that should be all we need to know about the Keychain.  If you want to dig a little deeper Apple has a good doc here.

Ok, so we know the keychain is basically a protected storage facility that the iOS kernel delegates read/write privileges to based on the cryptographic signature of each application.  These cryptographic signatures are known as “entitlements” in iOS parlance.  Essentially, an application must have the correct entitlement to access a given item in the keychain.  So, the most obvious way to go about attacking the keychain is to figure out a way to sign fake entitlements into an application (ok, patching the kernel would be another way to go, but that is a topic for another day).  As an example, if we can sign our application with the “apple” access group then we would be able to access any keychain item stored using this access group.  Hmmm…well, it just so happens that we can do exactly that with the “ldid” tool that is available in the Cydia repositories once you Jailbreak your iOS device.  When a user Jailbreak’s their phone, the portion of the kernel responsible for validating cryptographic signatures is patched so that any signature will validate. So, ldid basically allows you to sign an application using a bogus signature. But, because it is technically signed, a Jailbroken device will honor the signature as if it were from Apple itself.

Based on the above descrption, so long as we can determine all of the access groups that were used to store items in a user’s keychain, we should be able to dump all of them, sign our own application to be a member of all of them using ldid, and then be allowed access to every single keychain item in a user’s keychain.  So, how do we go about getting a list of all the access group entitlements we will need?  Well, the kechain is nothing more than a SQLite database stored in:

/var/Keychains/keychain-2.db

And, it turns out, the access group is stored with each item that is stored in the keychain database.  We can get a complete list of these groups with the following query:

SELECT DISTINCT agrp FROM genp

Once we have a list of all the access groups we just need to create an XML file that contains all of these groups and then sign our own application with ldid.  So, I created a tool that does exactly that called keychain_dumper.  You can first get a properly formatted XML document with all the entitlements you will need by doing the following:

./keychain_dumper -e > /var/tmp/entitlements.xml

You can then sign all of these entitlments into keychain_dumper itself (please note the lack of a space between the flag and the path argument):

ldid -S/var/tmp/entitlements.xml keychain_dumper

After that, you can dump all of the entries within the keychain:

./keychain_dumper

If all of the above worked you will see numerous entries that look similar to the following:

Service: Dropbox
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: SenSiTive_PassWorD_Here

Ok, so what does any of this have to do with what was being reported on a few weeks ago?  We basically just showed that you can in fact dump all of the keychain items using a jailbroken iOS device.  Here is where the discussion is more nuanced than what was reported.  The steps we took above will only dump the entire keychain on devices that have no PIN set or are currently unlocked.  If you set a PIN on your device, lock the device, and rerun the above steps, you will find that some keychain data items are returned, while others are not.  You will find a number of entries now look like this:

Service: Dropbox
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: <Not Accessible>

This fundamental point was either glossed over or simply ignored in every single article I happend to come across (I’m sure at least one person will find the article that does mention this point :-) ).  This is an important point, as it completely reframes the discussion.  The way it was reported it looks like the point is to show how insecure iOS is.  In reality the point should have been to show how security is all about trading off various factors (security, convenience, etc).  This point was not lost on Apple, and the keychain allows developers to choose the appropriate level of security for their application.  Stealing a small section from the keychain document from Apple, they allow six levels of access for a given keychain item:

CFTypeRef kSecAttrAccessibleWhenUnlocked;
CFTypeRef kSecAttrAccessibleAfterFirstUnlock;
CFTypeRef kSecAttrAccessibleAlways;
CFTypeRef kSecAttrAccessibleWhenUnlockedThisDeviceOnly;
CFTypeRef kSecAttrAccessibleAfterFirstUnlockThisDeviceOnly;
CFTypeRef kSecAttrAccessibleAlwaysThisDeviceOnly;

The names are pretty self descriptive, but the main thing to focus in on is the “WhenUnlocked” accessibility constants.  If a developer chooses the “WhenUnlocked” constant then the keychain item is encrypted using a cryptographic key that is created using the user’s PIN as well as the per-device key mentioned above.  In other words, if a device is locked, the cryptographic key material does not exist on the phone to decrypt the related keychain item.  Thus, when the device is locked, keychain_dumper, despite having the correct entitlements, does not have the ability to access keychain items stored using the “WhenUnlocked” constant.  We won’t talk about the “ThisDeviceOnly” constant, but it is basically the most strict security constant available for a keychain item, as it prevents the items from being backed up through iTunes (see the Apple docs for more detail).

If a developer does not specify an accessibility constant, a keychain item will use “kSecAttrAccessibleWhenUnlocked”, which makes the item available only when the device is unlocked.  In other words, applications that store items in the keychain using the default security settings would not have been leaked using the approach used by Fraunhofer and/or keychain_dumper (I assume we are both just using the Keychain API as it is documented).  That said, quite a few items appear to be set with “kSecAttrAccessibleAlways”.  Such items include wireless access point passwords, MS Exchange passwords, etc.  So, what was Apple thinking; why does Apple let developers choose among all these options?  Well, let’s use some pretty typical use cases to think about it.  A user boots their phone and they expect their device to connect to their wireless access point without intervention.  I guess that requires that iOS be able to retrieve their access point’s password regardless of whether the device is locked or not.  How about MS Exchange?  Let’s say I lost my iPhone on the subway this morning.  Once I get to work I let the system administrator know and they proceed to initiate a remote wipe of my Exchange data.  Oh, right, my device would have to be able to login to the Exchange server, even when locked, for that to work.  So, Apple is left in the position of having to balance the privacy of user’s data with a number of use cases where less privacy is potentially worthwhile.  We can probably go through each keychain item and debate whether Apple chose the right accessibility constant for each service, but I think the main point still stands.

Wow…that turned out to be way longer than I thought it would be.  Anyway, if you want to grab the code for keychain_dumper to reproduce the above steps yourself you can grab the code on github.  I’ve included the source as well as a binary just in case you don’t want/have the developer tools on your machine.  Hopefully this tool will be useful for security professionals that are trying to evaluate whether an application has chosen the appropriate accessibility parameters during blackbox assessments. Oh, and if you want to read the original paper by Fraunhofer you can find that here.


Java Cisco Group Password Decrypter

January 31, 2011

By Patrick Toomey

For whatever reason I have found myself needing to “decrypt” Cisco VPN client group passwords throughout the years.  I say “decrypt” , as the value is technically encrypted using 3DES, but the mechanism used to decrypt the value is largely obfuscative (the cryptographic key is included in the ciphertext).  As such, the encryption used is largely incidental, and any simplistic substitution cipher would have protected the group password equally well (think newspaper cryptogram).

I can’t pinpoint all the root reasons, but it seems as though every couple months I’m needing a group password decrypted.  Sometimes I am simply moving a Cisco VPN profile from Windows to Linux.  Other times I’ve been on a  client application assessments/penetrations test where I’ve needed to gain access to the plaintext group password.  Regardless, I inevitably find myself googling for “cisco group password decrypter”.  A few different results are returned.  The top result is always a link here.  This site has a simple web app that will decrypt the group password for you, though it occurs server-side.  Being paranoid by trade, I am always apprehensive sending information to a third party if that information is considered sensitive (whether by me, our IT department, or a client).  I have no reason to think the referenced site is malicious, but it would not be in my best interest professionally not to be paranoid security conscious, particularly with client information.  The referenced site has a link to the original source code, and one can feel free to download, audit, compile, and use the provided tool to perform all of the decryption client-side.  That said, the linked file depends on libgcrypt.  That is fine if I am sitting in Linux, but not as great if I am on some new Windows box (ok, maybe I’m one of the few who doesn’t keep libgcrypt at the ready on my fresh Windows installs).  I’ve googled around to see if anything exists that is more portable.  I found a few things, including a link to a Java applet, but the developer seems to have lost the source code…..  So, laziness won, and I decided it would be easier to spend 30 minutes to write my own cross-platform Java version than spend any more time on Google.

An executable jar can be found  here and the source can be found here.  I am not a huge fan of Java GUI development, and thus leveraged the incredible GUI toolkit built in to NetBeans.  The referenced source code should compile cleanly in NetBeans if you want the GUI.  If you simply want to decrypt group passwords with no dependencies you can run a command line version by compiling the “GroupPasswordDecrypter” class file.  This file has zero dependencies on third-party libraries and should be sufficient for anyone that doesn’t feel compelled to use a GUI (me included).

As a quick example, I borrowed a sample encrypted group password from another server-side implementation .  The encrypted group password we would like to decrypt is

9196FE0075E359E6A2486905A1EFAE9A11D652B2C588EF3FBA15574237302B74C194EC7D0DD16645CB534D94CE85FEC4

Or, if you prefer the command line version

Hope it comes in handy!

 


Virtualization: When and where?

November 17, 2009

We often field questions from our clients regarding the risks associated with hypervisor / virtualization technology.  Ultimately the technology is still software, and still faces many of the same challenges any commercial software package faces, but there are definitely some areas worth noting.

The following thoughts are by no means a comprehensive overview of all issues, but they should provide the reader with a general foundation for thinking about virtualization-specific risks.

Generally speaking, virtual environments are not that different than physical environments.  They require much of the same care and feeding, but that’s the rub; most companies don’t do a good job of managing their physical environments, either.  Virtualization can simply make existing issues worse.

For example, if an organization doesn’t have a vulnerability management program that is effective at activities like asset identification, timely patching, maintaining the installed security technologies, change control, and system hardening, than the adoption of virtualization technology usually compounds the problem via increased “server sprawl.”  Systems become even easier to deploy which leads to more systems not being properly managed.

We often see these challenges creep up in a few scenarios:

Testing environments – Teams can get the system up and running very quickly using existing hardware.  Easy and fast…but also dirty. They often don’t take the time to harden the system or bring it up to current patch levels or install required security software.

Even in the scenarios where templates are used, with major OS vendors like Microsoft and RedHat coming out with security fixes on a monthly basis a template even 2 months old is out of date.

Rapid deployment of “utility” servers – Systems that run back-office services like mail servers, print servers, file servers, DNS servers, etc.  Often times nobody really does much custom work on them and because they can no longer be physically seen or “tripped over” in the data center they sometimes fly under the radar.

Development environments – We often see virtualization technology making inroads into companies with developers that need to spin-up and spin-down environments quickly to save time and money.  The same challenges apply; if the systems aren’t maintained (and they often aren’t – developers aren’t usually known for their attention to system administration tasks) they present great targets for the would-be attacker.  Even worse if the developers use sensitive data for testing purposes.  If properly isolated, there is less risk from what we’ve described above but that isolation has to be pretty well enforced and monitoring to really mitigate these risks.

There are also risks associated with vulnerabilities in the technology itself.  The often feared “guest break out” scenario where a virtual machine or “guest” is able to “break out” of it’s jail and take over the host (and therefore, access data in any of the other guests) is a common one, although we haven’t heard of any real-world exploitations of these defects…yet.  (Although the vulnerabilities are starting to become better understood)

There are also concerns about the hopping between security “zones” when it comes to compliance or data segregation requirement.  For example, typically a physical environment has a firewall and other security controls between a webserver and a database server.  In a virtual environment, if they are sharing the same host hardware, you typically can not put a firewall or intrusion detection device or data leakage control between them.  This could violate control mandates found in standards such as PCI in a credit card environment.

Even assuming there are no vulnerabilities in the hypervisor technology that allow for evil network games between hosts, when you house two virtual machines/guests on the same hypervisor/host you often lose the visibility of the network traffic between them.  So if your security relies on restricting or monitoring at the network level, you no longer have that ability.  Some vendors are working on solutions to resolve intra-host communication security but it’s not mature by any means.

Finally, the “many eggs in one basket” concern is still a factor; when you have 10, 20, 40 or more guest machines on a single piece of hardware that’s a lot of potential systems going down should there be a problem.  While the virtualization software vendors will certainly offer high availability scenarios with technology such as VMware’s “VMotion”, redundant hardware, the use of SANs, etc., the cost and complexity adds up fairly fast.  (And as we have seen from some rather nasty SAN failures the past two months, SANs aren’t always as failsafe as we have been lead to believe. You still have backups right?)

While in some situations the benefits of virtualization technology far outweigh the risks, there are certainly situations where existing non-virtualized architectures are better. The trick is finding that line in the midst of the hell mell towards virtualization.

–Tyler


Hacker Halted 2009

October 21, 2009

As many of you know, Greg Ose and I recently spoke at Hacker Halted 2009 in Miami. We discussed a distributed password cracker that we designed and implemented that utilizes redirected browsers to build a swarm of worker nodes. The method which we demonstrated can be implemented using large numbers of otherwise useless stored cross-site scripting vulnerabilities. The client-side worker was implemented as a Java applet in an injected iframe.

Greg and I also showed several methods which can be used on different platforms to trick the Java virtual machine into continuing execution after a client has closed the page where it is embedded. This can be used to maintain large numbers of workers even when the vulnerable sites are not visited for long periods of time.

The following video shows the administrative interface to DistCrypt where we can add and manage password hashes.

You can view the high quality version here.

You can also view the slides from our presentation on the Hacker Halted website here.


What do Rootkits and Xerxes have in common?

August 25, 2009

By: Cris Neckar and Greg Ose

Just think of your kernel as Greece…

It’s probably not the first thing you expected to hear me say but bear with me. The arms race between rootkit development and detection has been raging for years, and although there have been great advancements in detection, rootkit developers have always had the upper hand. You simply cannot defend against what you cannot predict. Recently the battle has become increasingly heated as rootkit developers dig deeper and deeper. One only need look at the Blackhat Briefings for the past few years to see the trend.

Increasingly subversive and complex rootkits threaten to over-run our kernels and turn our business critical systems into slaves. So what are we to do? Should we simply throw up our hands and resolve ourselves to a life of bondage?

To answer this question lets approach the problem from a different angle. It has become clear that we cannot simply monitor everything a rootkit developer might change. There are just too many function pointers, operating modes, and devices to allow us to watch them all. However, one commonality does exist in all kernel mode rootkits. In some way they must all be inserted into the kernel.

There are a number of ways to introduce new code into a running kernel but not nearly so many as there are places to hide once inside. I propose that, instead of building a blacklist and peeking under all the rocks where they have hidden in the past, we instead monitor the entry point. Leonidas used a similar approach when Xerxes threatened Greece, instead of fighting on even ground the Spartans guarded the narrow door through which the Persians must travel to enter Greece.

So, you say, it’s a great theory, but how do we implement it?  Assuming an attacker has compromised a system’s most privileged usermode context, for instance a process running as root, code can be introduced into the running kernel in a number of ways. The most obvious of these is to install a module or driver and be done with it. Aside from that, an attacker would need to patch kernel memory directly, this is not trivial but is still a common method of rootkit insertion. Aside from these options, the attacker would be forced to edit the kernel binary, or an existing module on disk and either cause or wait for a reboot (There is one additional threat which we will save for later). The interesting thing about all of these potential insertion points is that they are events which would almost never occur in a production server (If they do occur an administrator will certainly be aware of the reason). By monitoring these few events we would be notified of the insertion of any existing kernel mode rootkit that I am aware of, as well as any that could be written in the future (barring a further vulnerability.. more on this later).

To implement this on Linux is actually fairly trivial. To monitor for code inserted into the kernel while running we can use the tricks of rootkit developers ourselves. Namely we can hook sys_init_module (for module insertions), and the write() handlers within the kmem and mem file operations structures (for writes to /dev/(k)mem where these are still possible).

The next step is to defend against changes made to kernel components on disk. For the kernel image itself, on a system which is rarely rebooted such as a production server, it is possible to accomplish this by booting a known good kernel. For example a kernel could be booted from media which is only physically attached to the system when the kernel is initially loaded into memory. A similar approach could be used with required kernel modules, or if preferable, they could be compared on load to known good check sums.

Now that we have covered the entry points we can focus on what to do once one of these events occurs. We do not actually need to prevent the insertion as, in some cases, these events may legitimately occur (once a root compromise has occurred a re-image of the system is generally the best approach anyway). Instead we simply need to securely log the event to a location which is not controlled by the intruder. Obviously we can not log locally. Additionally we cannot log with mechanisms which are already under the attackers control (anything accessible from user mode is out). One relatively trivial solution is to simply create a syslog packet and drop it on the network stack when one of our traps triggers. This will provide us with a remote log on a system which (hopefully) has not been compromised. When implementing this you would of course want to be careful to temporarily disable any netfilter hooks (think iptables) as these could allow the intruder to block our logging packet from usermode.

Like Sparta’s defense at Thermopolae, this method also has its mountain path. I mentioned earlier that there was an additional method for introducing code into a running kernel. This would involve the existence of a vulnerability within the kernel or a loaded module which allows arbitrary memory to be written. An attacker could create an exploit which uses such a vulnerability to introduce a rootkit into the kernel. We cannot possibly predict where an exploit will occur (if we could, vulnerability research would be a lot easier :) ) making it impossible (so far ;) ) to guard against this threat using this method. Fortunately a rootkit which used this installation method would be obsolete the moment it was first detected. This means that this type of rootkit could never reach widespread usage and would most likely be used only in a very targeted attack.


ViewStateViewer: A GUI Tool for deserializing/reserializing ViewState

August 3, 2009

By: Patrick Toomey

Background

So, I was reading the usual blogs and came across a post by Mike Tracy from Matasano (Matasano has been having some technical difficulties…this link should work once they have recovered).  In the blog post Mike talks about the development of a ViewState serializer/deserializer for his WWMD web application security assessment console  (please see Mike’s original post for more details).  Mike noted that the tools for viewing/manipulating ViewState from an application testing perspective are pretty weak, and I can’t agree more.  There are a number of ViewState tools floating around that do a tolerable job of taking in a Base64 encoded ViewState string and presenting the user with a static view of the deserialized object. However, no tools up until this point, save for Mike’s new implementation, have allowed a user to change the values within the deserialized object and then reserialize them into a new Base64 string. Mike’s tool does exactly this, and is immensely useful for web application testing. The only downside to Mike’s tools is that it is built for his workflow and not mine (how can I fault the man for that).  So, I decided to build an equivalent that works well for me (and hopefully for you as well).

I tend to have a web proxy of some sort running on my machine throughout the day. There are tons of them out there and everyone seems to have their personal favorite. There is Paros, WebScarab, BurpSuite, and I am sure many others. In the last few months I have been using a newer entrant into the category, Fiddler.  Fiddler is a great web proxy whose only big drawback is that it is Windows only.  However, at least for me, the upsides to Fiddler tend to outweigh the negatives.  Fiddler has a fairly refined workflow (don’t get me started on WebScarab), is stable (don’t get me started on Paros), and is pretty extensible.  There are a number of ways to extend Fiddler, most trivially using their own FiddlerScript hooks.  In addition, there is a public API for extending the application using .NET.  Fiddler has a number of interfaces that can be extended to allow for inspecting and manipulating requests or responses. Please see the Fiddler site for more details on extending Fiddler using either FiddlerScript or .NET.  In particular, take a look at the development section to get a better feel for the facilities provided by Fiddler for extending the application.

Anyway, I had been thinking about writing a ViewState serializer/deserializer for Fiddler for the past month or two when I saw Mike’s blog post.  I decided that it was about time to set aside a little time and write some code.  I was lucky that Fiddler uses .NET, as I was able to leverage all of the system assemblies that Mike had to decompile in Reflector. After a bit of coding I ended up with my ViewStateViewer Fiddler inspector.  Let’s take a quick tour…

ViewStateViewer seamlessly integrates into the Fiddler workflow, allowing a user to manipulate ViewState just as they would any other variable in a HTTP request.  An extremely common scenario for testing involves submitting a request in the browser, trapping the request in a proxy, changing a variable’s value, and forwarding the request on to the server. ViewStateViewer tries to integrate into this workflow as seamlessly as possible.  Upon trapping a request that contains ViewState, ViewStateViewer extract the Base64 encoded ViewState, Base64 decodes it, deserializes the ViewState, and allows a user to manipulate the ViewState as they would any other variable sent to the server.  Let’s take a look at some screenshots to get a better idea of how this works.

ViewStateViewer Tour

Serialized ViewStateSerialized ViewState

By Default, Fiddler lets a user trap requests and view/edit a POST body before submitting the request to the server.  In this case, the POST body contains serialized ViewState that we would like to work with.  Without ViewStateViewer this is non-trivial, as Fiddler only shows us the Base64 encoded serialization.

Deserialized ViewStateDeserialized ViewState

ViewStateViewer adds a new “ViewState” tab within Fiddler that dynamically identifies and deserializes ViewState on the fly.  The top half of the inspector shows the original Base64 serialized ViewState string.  The bottom half of the inspector shows an XML representation of the deserialized ViewState.  In between these two views the user can see if the ViewState is MAC protected and the version of the ViewState being deserialized.  In this case we can see that this is .NET 2.X ViewState and that MAC protection is not enabled.

ReSerialized ViewStateReserialized ViewState

Once the ViewState is deserialized we can manipulate the ViewState by changing the values in the XML representation.  In this example we changed one of the string values to “foobar”, as can be seen in the figure above.  Once we change the value we can reserialize the ViewState using the “encode” button.  The reserialized Base64 encoded ViewState string can be seen in the top half of the ViewStateViewer window.  Once we have “encoded” the ViewState with our modifications, ViewStateViewer automatically updates the POST body with the new serialized ViewState string.  This request can now be “Run to Completion”, which lets Fiddler submit the updated request to the server.

Limitations

It should be noted that if the original ViewState had used MAC protection ViewStateViewer would not be able to reserialize the manipulated ViewState with a valid MAC.  While ViewStateViewer will not prevent you from deserializing, manipulating, and reserializing MAC protected ViewState, it will not be able to append a valid MAC to modified ViewState.  ViewStateViewer will warn us that “Modified ViewState does not contain a valid MAC”.  Modified requests made using reserialized MAC protected ViewState will ultimately fail, as we don’t know the machine key used to produce a valid MAC.  Regardless, sometimes simply being able to view what is being stored in MAC protected ViewState can be useful during an application assessment.

In addition to MAC protection, ViewState can be optionally protected using encryption.  Encryption will prevent any attempt by ViewStateViewer to deserialize the ViewState.  If encryption is detected ViewStateViewer will simply show the user the original Base64 ViewState string.  However, as any application security consultant can attest, there are many applications that do not encrypt or MAC protect their ViewState.  ViewStateViewer is aimed squarely at these use cases.

Finally, ViewStateViewer was written entirely with deserializing/serializing ViewState 2.X in mind. While ViewState 1.X is supported, the support at this time is limited, though completely functional. ViewState 1.X, unlike ViewState 2.X, Base64 decodes into a completely human readable ASCII based serialization format (think JSON).  As such, ViewStateViewer simply Base64 decodes ViewState 1.X and displays the result to the user. The user is then free to make any changes they wish to the decoded ViewState. This works exactly the same as the example shown above, except that the decoded ViewState object will not be a nicely formatted XML document. I might get around to adding true ViewState 1.X support, but the benefits would be purely cosmetic, as the current implementation has no functional limitations.

Wrap Up

ViewState is one of those areas that tends to be under-tested during application assessments, as there have been no good tools for efficiently evaluating the effects of modifying ViewState.  Hopefully the release of ViewStateViewer will make evaluation of ViewState a more common practice.  If you want to give ViewStateViewer a try you can download the Fiddler plugin(with source) here. Simply copy ViewStateViewer.dll into your “Inspectors” folder within your Fiddler install directory.  It should be noted that this inspector is for Fiddler2, so please update your copy of Fiddler if you are out of date.  Finally, this is very much a work in progress.  As I found out during development, there is ton of ViewState out there that is either non-trivial to deserialize/reserialize or is impossible to deserialize/reserialize (non-standard objects being serialized for example).  Maybe I’ll do another blog post detailing some of these difficult/impossible to handle edge cases in a subsequent post.  So, while I have done some limited testing, I am sure there are some bugs that will crop up.  If you happen to find one please don’t hesitate to contact me.


Directory Traversal in Archives

April 21, 2009

By: Greg Ose and Patrick Toomey

I’m sure on the top of everyone’s list of resolutions from the New Year is the ever forgotten “I will write more secure code” and it seems that each year this task gets harder. With more complex and abstracted frameworks and APIs, the ways security related bugs are being introduced to a code base has become equally complex and abstracted. Being a few months into 2009, hopefully we can help you catch up on your resolutions by presenting something else to look for when reviewing or writing secure code.

In recent engagements, we have run into a slew of issues focusing around the well-known vulnerability of directory path traversal. As a refresher, this typically involves injecting file path meta-characters into a filename string to reference arbitrary files and usually results in the modification or disclosure of files on the system. For example, a user supplies the filename /../../etc/passwd which is appended to the path /tmp/uploaded_pictures and ends up referencing the password file instead of a file under the intended directory.

We all know, or at least should know, what a typical directory traversal vulnerability and exploit looks like, however, we have recently seen these issues manifest themselves in the handling of user-provided archive files instead of file path strings. Typically, these user provided files are sent via HTTP uploads. Almost all of the common high-level application APIs provide a means, or a third-party library, to handle archive files. Additionally, almost all of these libraries do not check for potential directory path traversal when they perform the extraction of these files. This puts the liability on the developer to check for malicious archives. While file operation calls with a user controlled variable may be obvious, filenames within user-controlled archives may be the vulnerability that slips by. Developers should not only validate user supplied file paths for directory traversal, but also check file paths included in archive files. As a note, this type of vulnerability has been mentioned before and is not groundbreaking by any means, but we want to take a detailed look into what to be aware of as a developer and how to test for this during vulnerability assessments.

To get started lets take a look at an example provided by Sun themselves (!!!) in a technical article for the java.util.zip package. Code Sample 1 from the article provides their base example for extracting an archive and is shown below.

import java.io.*;
import java.util.zip.*;

public class UnZip {
  final int BUFFER = 2048;
  public static void main (String argv[]) {
    try {
      BufferedOutputStream dest = null;
      FileInputStream fis = new FileInputStream(argv[0]);
      ZipInputStream zis = new ZipInputStream(
                               new BufferedInputStream(fis));
      ZipEntry entry;
      while((entry = zis.getNextEntry()) != null) {
        System.out.println("Extracting: " +entry);
        int count;
        byte data[] = new byte[BUFFER];
        // write the files to the disk
        FileOutputStream fos = new FileOutputStream(
                                   entry.getName());
        dest = new BufferedOutputStream(fos, BUFFER);
        while ((count = zis.read(data, 0, BUFFER)) != -1) {
          dest.write(data, 0, count);
        }
        dest.flush();
        dest.close();
      }
      zis.close();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

We can see where the vulnerability manifests itself in processing each entry of the provided ZIP file:

FileOutputStream fos = new FileOutputStream(entry.getName());

entry is the current ZIP entry being processed and getName() returns the filename stored in that entry. After retrieving this filename, the uncompressed data is written to its value. We can see that by using directory traversal in the filename a malicious user may be able to make arbitrary writes anywhere on the filesystem. Unfortunately, on most platforms, if an attacker can arbitrarily write files they can most likely also get arbitrary code executed on the affected server.

Similar issues exist with a number of ZIP library implementations across various languages. As one might expect, the equivalent Python code is far less verbose. While Python doesn’t provide any sample code, a simple, and vulnerable, ZIP extraction would look as follows:

from zipfile import ZipFile
import sys
zf = ZipFile(sys.argv[1])
zf.extractall()

The extractall method does what one would expect it to do, except that it does not check for directory traversal in the ZIP entries’ file paths. Python also provides equivalent objects for handling tar archives. Interestingly, the tar archive library documentation does make mention of the risk associated with path traversal within archive files. The documentation for the extractall method states:

Warning: Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of path, e.g. members that have absolute filenames starting with “/” or filenames with two dots “..”.

How about PHP, surely they provide a function to work with ZIP files (what don’t they have a function for). The PHP manual provides the following example code for extracting ZIP files.

<?php
$zip = new ZipArchive;
$res = $zip->open('test.zip');
if ($res === TRUE) {
  echo 'ok';
  $zip->extractTo('test');
  $zip->close();
} else {
  echo 'failed, code:' . $res;
}
?>

Sure enough, this code is also vulnerable to file path manipulation within the archive.

What about everyone’s favorite language du jour, Ruby? Ruby itself does not have ZIP file extraction built in to the language’s core library. However, rubyzip is a popular third-party library and like the prior libraries, is also vulnerable to directory traversal. The example below was stated in a post by the library’s author as how to extract a ZIP file and all of its directories:

require 'rubygems'
require 'zip/zipfilesystem'
require 'fileutils'

OUTDIR="out"

Zip::ZipFile::open("all.zip") {
  |zf|

  zf.each { |e|
    fpath = File.join(OUTDIR, e.name)
    zf.extract(e, fpath)
    FileUtils.mkdir_p(File.dirname(fpath))
  }
}

Finally, similar to Ruby, the .Net environment does not have ZIP archive handling built in to the core library. A quick googling for “.Net zip files” leads to an article on MSDN. In this article, the authors detail this gap in the .Net library and then go on to present a solution. The tools released include a signed DLL for use during development and a set of command-line utility programs that utilize the library. One of these command-line utilities is Unzip.exe. Sure enough, Unzip.exe is vulnerable to path traversal within an archive. No warning is presented and the archive is extracted without concern to the fully resolved path of the files within the archive.

How do mainstream, standalone, compression utility programs handle this vulnerability? We tested a large number of archive extraction programs (Winzip, Winrar, command line Info-Zip, unzip on Unix, etc) and noted that all of them either provide a warning when a ZIP file entry contains directory traversal, escape the meta-characters, or just ignore the traversed directory path all together.

When writing code that interacts with archives, the same precautions used by mainstream extraction utilities must be performed by the developer. As with any user-controlled input, the directory filenames should be validated before being processed by any file operation. The developer should verify that path traversal characters do not occur in any entries within the archive. Similarly, the developer may also leverage utility functions within their language to first determine the fully resolved path before extracting an entry (ex. os.path.normpath(path) in Python).

A more drastic mitigation, though perhaps the better long-term solution, would involve modifying these default libraries to work similarly to their standalone application counterparts by default. It is extremely rare to require path traversal characters in a legitimate archive. Perhaps, the libraries should be modified to secure the common case, requiring a developer to explicitly request the atypical case. For example, what if the Python ZipFile object changed its default behavior to throw an exception in the presence of file traversal characters? The extractall method signature could be modified as follows:

ZipFile.extractall([path[,members[,pwd[,allow_traverse]]]])

By default the allow_traverse is set to False, throwing zipfile.BadZipfile if path traversal characters are encountered. This would provide a secure by default configuration for the library while still allowing the existing behavior if necessary. This requires the developer to explicitly request support for path traversal, thus mitigating accidental and insecure usage. This is unlikely to impact existing code, as archives with path traversal characters are not easy to create and it is extremely unlikely a legitimate archive would accidentally include such characters.

During the course of this write-up we grew tired of hand-editing zip archives in a hex-editor to add directory traversal characters. So, we put together a Python script that can be used to generate ZIP archives with path traversal sequences automatically inserted. It can create directories in both Unix and Windows environments for ZIP files (including jar) and tar files with and without compression (gzip or bzip2). You can specify an arbitrary number of directories to traverse and an additional path to append (think var/www or Windows\System32). The full usage follows:

$ ./evilarc.py --help
Usage: evilarc <input file>

Create archive containing a file with directory traversal

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -f OUT, --output-file=OUT
                 File to output archive to.  Archive type is
                 based off of file extension.  Supported
                 extensions are zip, jar, tar, tar.bz2, tar.gz,
                 and tgz.  Defaults to evil.zip.
  -d DEPTH, --depth=DEPTH
                 Number directories to traverse. Defaults to 8.
  -o PLATFORM, --os=PLATFORM
                 OS platform for archive (win|unix). Defaults
                 to win.
  -p PATH, --path=PATH  Path to include in filename after
                 traversal.  Ex:WINDOWS\System32\

The following example shows the file test.txt being added to an archive and extracted to the C:\Windows\System32 directory through the vulnerable Java class we previously discussed:

$ ./evilarc.py test.txt -p Windows\\System32\\
Creating evil.zip containing ..\..\..\..\..\..\..\..\Windows\System32\test.txt

$ java javaunzip evil.zip
Extracting: ..\..\..\..\..\..\..\..\Windows\System32\test.txt

$ ls -al /cygdrive/c/Windows/System32/test.txt
-rwxr-x---+ 1 gose mkgroup-l-d 21 Feb 24 11:52 /cygdrive/c/Windows/System32/test.txt

We have made the script available for download here:

https://github.com/Neohapsis/evilarc


Follow

Get every new post delivered to your Inbox.