Java Cisco Group Password Decrypter

By Patrick Toomey

For whatever reason I have found myself needing to “decrypt” Cisco VPN client group passwords throughout the years.  I say “decrypt”, as the value is technically encrypted using 3DES, but the mechanism used to decrypt the value is largely obfuscative (the cryptographic key is included in the ciphertext).  As such, the encryption used is largely incidental, and any simplistic substitution cipher would have protected the group password equally well (think newspaper cryptogram).

I can’t pinpoint all the root reasons, but it seems as though every couple months I’m needing a group password decrypted.  Sometimes I am simply moving a Cisco VPN profile from Windows to Linux.  Other times I’ve been on a client application assessment/penetration test where I’ve needed to gain access to the plaintext group password.  Regardless, I inevitably find myself googling for “cisco group password decrypter”.  A few different results are returned.  The top result is always a link here.  This site has a simple web app that will decrypt the group password for you, though it occurs server-side.  Being paranoid by trade, I am always apprehensive sending information to a third party if that information is considered sensitive (whether by me, our IT department, or a client).  I have no reason to think the referenced site is malicious, but it would not be in my best interest professionally not to be security conscious, particularly with client information.  The referenced site has a link to the original source code, and one can feel free to download, audit, compile, and use the provided tool to perform all of the decryption client-side.  That said, the linked file depends on libgcrypt.  That is fine if I am sitting in Linux, but not as great if I am on some new Windows box (ok, maybe I’m one of the few who doesn’t keep libgcrypt at the ready on my fresh Windows installs).  I’ve googled around to see if anything exists that is more portable.  I found a few things, including a link to a Java applet, but the developer seems to have lost the source code…  So, laziness won, and I decided it would be easier to spend 30 minutes writing my own cross-platform Java version than spend any more time on Google.

The code for the decrypter can be found on our github, here.  I am not a huge fan of Java GUI development, and thus leveraged the incredible GUI toolkit built into NetBeans.  The referenced source code should compile cleanly in NetBeans if you want the GUI.  If you simply want to decrypt group passwords with no dependencies you can run a command line version by compiling the “GroupPasswordDecrypter” class file.  This file has zero dependencies on third-party libraries and should be sufficient for anyone who doesn’t feel compelled to use a GUI (me included).

As a quick example, I borrowed a sample encrypted group password from another server-side implementation.  The encrypted group password we would like to decrypt is

9196FE0075E359E6A2486905A1EFAE9A11D652B2C588EF3FBA15574237302B74C194EC7D0DD16645CB534D94CE85FEC4

Or, if you prefer to skip the GUI, the command line version produces the same result.
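
For the curious, the “decryption” itself is short enough to sketch in a few lines of Python.  This mirrors the well-known libgcrypt-based implementation (the 3DES key is derived, via SHA1, from the first 20 bytes of the ciphertext itself); it is a sketch, and assumes the pycryptodome package for 3DES:

# Sketch of the well-known algorithm (cf. the libgcrypt-based decrypter);
# requires the pycryptodome package for 3DES.
import hashlib
from Crypto.Cipher import DES3

def decrypt_group_password(enc_hex):
    data = bytes.fromhex(enc_hex)
    ht = bytearray(data[:20])           # key material prepended to the ciphertext
    iv = data[:8]                       # 3DES CBC IV
    ht[19] = (ht[19] + 1) & 0xFF
    h2 = hashlib.sha1(ht).digest()
    ht[19] = (ht[19] + 2) & 0xFF
    h3 = hashlib.sha1(ht).digest()
    key = h2 + h3[:4]                   # 24-byte 3DES key derived from the prefix
    plain = DES3.new(key, DES3.MODE_CBC, iv).decrypt(data[40:])
    return plain[:-plain[-1]]           # strip PKCS#5 padding

print(decrypt_group_password("9196FE0075E359E6A2486905A1EFAE9A11D652B2C588EF3FBA15574237302B74C194EC7D0DD16645CB534D94CE85FEC4"))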

Hope it comes in handy!

How NOT to build your client-server security architecture

By: Patrick Toomey

Traditional client-server and web applications aren’t that different from a security standpoint.  Sure, native UI controls are great for a nice look and feel, and access to native OS APIs is great for creating a high performance application that integrates well with the target platform.  But, from a security standpoint, they are on equal footing.  Neither should trust the client, requiring that developers place all of the security relevant logic on the server.  By and large most web application developers seem to understand that the browser is not to be trusted.  That isn’t to say that all web application developers get it.  There are definitely those web apps that delegate a bit too much to the browser, using hidden form fields inappropriately, depending on client-side JavaScript a bit too heavily, amongst other things.  But, on the whole, I have yet to see nearly the number of web applications that were designed with as many fundamental architectural flaws as seems to be common amongst niche vertical market client-server applications.  Wow, did I just say “niche vertical market client-server applications”?  What does that mean?  I am talking about the types of client-server applications that were either developed in house, or were bought commercially, but serve a very specific task.  Before I get into trouble for generalizing, yes, I concede that my views are shaped by my experiences.  I am absolutely sure that there are scores of “niche vertical market client-server applications” that are rock solid from a security standpoint.  But, my experiences with these types of applications, combined with recent war stories of some colleagues, compelled me to rant…just a little.

The vast majority of the applications we assess are web based.  Sure, we look at our fair share of client-server applications, but I would say the ratio is probably on the order of 10 to 1.  For whatever reason, we had a string of more traditional client-server applications on our plate in the last few weeks.  Looking at the types of vulnerabilities identified in these applications, as well as other similar applications we have reviewed in the past, it seems as though there is a fundamental gap in how a fair percentage of these developers are approaching security.  To illustrate, let’s listen in on a couple of web developers discussing a few pieces of their security architecture for a new web app they are about to deploy.

Web Dev 1:  So, we need to check to make sure only authenticated and authorized users are allowed to access our app.

Web Dev 2:  Yeah, it seems like this whole username password thing is pretty popular.  I vote we go with that.

Web Dev 1:  Ok, yeah, no need to reinvent the wheel here.  So, how should we go about validating the user’s username and password?

Web Dev 2:  Well, we could send an AJAX request to the server, passing the user’s username, and get back all of the user’s information, including their password, and simply validate it in JavaScript.  That way we can keep the logic really clean and simple on the server side of things.

Web Dev 1:  Yeah, that is awesome; with a little more effort we almost don’t need an application server.  We can do everything client-side.

Web Dev 2:  Yeah, we can do all the logic on the client in JavaScript and simply make calls to the backend to store and retrieve data.

Web Dev 1:  Wow, I think we have hit on the next big thing.  All we really need is to make some sort of direct ODBC connection to our database via JavaScript and we’ve nearly collapsed an entire tier out of the three tier architecture…incredible.

Web Dev 2:  Hmmm, yeah, it sounds pretty great, but we are putting a lot of our eggs into one basket.  What if someone figures out how to view or manipulate traffic as it traverses the network?  They might be able to do bad stuff.

Web Dev 1:  Yeah, true, there must be some way to prevent our users from mucking around with the network traffic.

Web Dev 2:  I got it!!  We can use SSL.  That uses top notch cryptography that nobody can break.

Web Dev 1:  I think we have it.  But, I thought I remembered reading about some sort of tool that lets users man-in-the-middle their own SSL traffic, something to do with proxies and self-signed certificates or something?

Web Dev 2:  Hmmmm…that could be a problem, but it sounds like it would be really hard for users to do.

Web Dev 1:  Yeah, nobody will figure it out.  I think we are good to go.

Ok, the above conversation is obviously hyperbolic for effect, but you get what I am saying, right?  The above conversation just wouldn’t happen in a web app world.  Almost every sentence is filled with principles that are antithetical to the web security model.  Validating passwords on the client-side in the browser, connecting directly to the database from the browser (thankfully this doesn’t exist…I googled it just to make sure there wasn’t some inept RFC I wasn’t aware of), or treating SSL like it is some sort of magic crypto juju you can sprinkle over your application to protect yourself against users that have full control of the execution environment.

I have seen some web applications in a pretty sad state of disrepair, but I have never seen a web application that was architected with all of the above fundamental flaws.  Sadly, on the client-server side of things I have seen each of the above approaches used with regularity.  Applications that validate passwords by requesting them from the server, applications that treat the server simply as an ODBC connection, and applications that depend on SSL as their sole security control for preventing malicious user input and/or the disclosure of highly sensitive information (e.g., those passwords being sent back to the client from the server).  So, in short, if you are writing a client-server application, think of your client as the browser, and treat it with the same degree of trepidation.  The client cannot be trusted to enforce your security policy, and any expectation that it will do so is likely to result in an architectural flaw that may very well subvert the entirety of your security model.

Even if You Don’t Invent Your Own Crypto….It’s Still Hard

By: Patrick Toomey

So, yet another crypto related post.  Oftentimes crypto related attacks fall on deaf ears; for most people crypto is like magic (include me in that group every now and again), and it becomes difficult for the average developer to differentiate the risks associated with the attacks circulated within the security community.  There is a vast chasm between the crypto community and just about everyone else.  The cryptographers will talk about differential cryptanalysis, meet in the middle attacks, perfect forward secrecy, etc, etc.  Often, the cryptographers’ goals are different than your average developer’s.  Cryptographers are trying to achieve perfection.  For them, any flaw in a cryptographic protocol is something to be discussed, researched, and fixed.  As the famous saying in the crypto community goes, “attacks only get better”.  So, even the smallest flaw is worth their time and effort, as it may lead to a practical crack that will warrant more immediate attention.

Unlike cryptographers, for the vast majority of developers the goal is “good enough” security.  Often the result is not good enough, but the goal is admirable, as it doesn’t make sense to over-invest in security.  As a result, there is a chasm between “good enough” for cryptographers and “good enough” for developers.  In this chasm live the real world attacks that are too boring for your average cryptographer to discuss, yet too subtle to be obvious to your average developer.  Many of these attacks have nothing to do with the cryptography itself.  Instead, many real world crypto attacks have more to do with how developers piece together proven crypto building blocks insecurely.  It was on a recent assessment that I came upon the following story, which perfectly illustrates how cryptography is difficult to get right, even when you think you have made an informed decision.

On a recent assessment a colleague of mine came across a typical SQL injection attack.  In fact, it was just the kind of SQL injection you hope for as an attacker, as it returned data to the user in a nice tabular output format.  After a bit of mucking around we managed to find where the system was storing credit card information for its users.  A few minutes later we had output that looked like the following:

[Figure: Base64 Credit Card Numbers]

Hmmm, base64 encoded credit card numbers.  I wonder what they look like when we base64 decode them.  A simple base64 decode resulted in the following hex values:

[Figure: Base64 Decoded Credit Card Numbers]

If you start trying to translate the above hex values to their ASCII equivalents you will quickly find that the values do not represent plaintext ASCII credit card numbers.  Our first guess was that they were likely encrypted.  So, step two is trying to figure out how they were encrypted.  We took a look at the base64 decoded values and observed a simple fact: the vast majority of them were either 15 or 16 bytes long.  Well, a 16 byte long ciphertext could be any 128 bit block cipher in simple ECB mode.  But, the 15 byte entries were evidence against that, as they would get padded out to 16 bytes due to the requirement that block ciphers encrypt full blocks (16 bytes in the case of AES).  So, my first thought was the use of a stream cipher.  In its most basic form, a stream cipher simply takes a secret value as input (the key) and begins to generate a pseudorandom output stream of bytes.  Encryption simply involves XORing the stream of random bytes with your plaintext.  Unlike a block cipher, there is no need to pad out your plaintext, as you only use as many pseudorandom bytes as are necessary to XOR with your plaintext.  Decryption is also simple, as XORing is a trivially invertible operation.  The user simply feeds the algorithm the same key, generates the same pseudorandom byte stream, and XORs that with the ciphertext.

To see why this works, let’s walk through a single plaintext/ciphertext pair.  Let us say we wish to encrypt the credit card number “4012888888881881”.  The ASCII values in hex result in a plaintext array of the following values:

[Figure: Plaintext Credit Card Number ASCII in Hex]

Also, let us assume that we are using a stream cipher with a strong key whose first 16 output bytes are:

[Figure: Pseudorandom Byte Stream in Hex]

When we XOR the plaintext with the pseudorandom stream we obtain the ciphertext.

[Figure: Resulting Ciphertext]

To decrypt we simply take the ciphertext and perform the same XOR operation.  Let’s quickly take our ciphertext and see how we decrypt it.

[Figure: Decrypted Ciphertext]

When we XOR the ciphertext with the stream we obtain the plaintext we started with.  Through the whole encryption/decryption process we have effectively XORed the plaintext with the stream twice.  XORing anything with the same value twice is effectively a NOP on the original value.  Pictorially:

[Figure: Full Encryption Decryption Cycle]
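
The same round trip can be captured in a few lines of Python.  Note that the keystream bytes below are fabricated purely for illustration; in the real system they would be generated from the secret key:

# Toy stream cipher round trip: ciphertext = plaintext XOR keystream.
# The keystream here is made up; a real cipher derives it from the key.
plaintext = b"4012888888881881"
keystream = bytes.fromhex("1f3a9c04d27be851e640f3ad0b92c7e8")  # fabricated

xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))

ciphertext = xor(plaintext, keystream)
assert xor(ciphertext, keystream) == plaintext  # XORing with the stream twice is a NOP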

However, our examples so far have assumed knowledge of the pseudorandom stream.  Without knowledge of the key it is difficult to reproduce the pseudorandom stream, and thus very difficult to perform the steps we just outlined to obtain the plaintext for a given ciphertext.  Moreover, we don’t even know what algorithm is in use here.  Did the developers use RC4?  Did they try to make their own pseudorandom generator by repetitively hashing a secret key?  We didn’t really know, but regardless of the algorithm, there is inherent risk in leveraging a stream cipher in its most basic form.  If the stream cipher uses a shared key, the pseudorandom stream will be identical every time it is initialized.  To test this conjecture we used several test user accounts to register the same credit card number.  Sure enough, identical credit card numbers resulted in identical ciphertext.  So, what can we do with this?  We have no idea what algorithm was used, and we have no idea what the value of the secret key is.  However, it turns out that we don’t need any of that information.  To see why, let’s first take a look at the hex of two encrypted credit cards we base64 decoded above.

[Figure: First Encrypted Credit Card]

The above shows the 16 ciphertext bytes for one of the credit card numbers in the database.

[Figure: Second Encrypted Credit Card]

The above shows the 16 ciphertext bytes for a second credit card number in the database.

Let’s assume that the ciphertext from the first encrypted credit card was ours, where we know the plaintext ASCII hex values for the card.

[Figure: Plaintext Bytes for the First Credit Card]

Let’s also assume that the second credit card represents a credit card of unknown origin (i.e., all we can observe is the ciphertext; we have no idea what the digits of the card are).  Similarly, without knowledge of the key, we have no idea how to generate the pseudorandom byte stream used to encrypt the cards.  But, what happens if we simply XOR these two ciphertexts together?

[Figure: XOR of the Two Encrypted Credit Cards]

Well, that doesn’t really look like it got us anywhere, as we still don’t have anything that looks like an ASCII credit card number.  However, let’s take a look at what actually happened when we did the above operation.  We are confident that both credit card numbers were encrypted with the same pseudorandom byte stream (remember that encryption of the same credit card under different users resulted in the same ciphertext).  Furthermore, we know that the first credit card number is “4012888888881881”, as that is our own credit card number.  The key insight is that we have effectively XORed four different byte sequences together.  We have XORed the plaintext bytes of our credit card, the plaintext bytes of an unknown credit card, and the same pseudorandom sequence twice.

[Figure: Encrypted Credit Card XOR Expanded]

In the above picture the unknown plaintext credit card number is represented with a series of “YY”s.  Similarly, the unknown pseudorandom stream is represented with a series of “XX”s.  However, despite the fact that we don’t know the value of the pseudorandom stream, we do know that both copies of it are the same.  And, as noted above, XORing anything with the same value twice is effectively a NOP.  So, the above three XOR operations can be simplified to the following:

[Figure: Simplified XOR of Encrypted Credit Cards]

So, what we actually did by XORing the two ciphertexts is create a new value that is the XOR of the two plaintexts.  Wow, that is kind of cool.  We just got rid of the pseudorandom byte stream that is supposed to provide all the security without ever knowing the key.  But, what do we do now?  We still don’t have plaintext credit card numbers yet.  Ah, but we do!  You can think of what we now have as the unknown credit card number encrypted with our known credit card number as the pseudorandom stream.  In other words, we can simply XOR our result with our known credit card value again and we will obtain the resulting plaintext for the unknown credit card number.

[Figure: Decrypted Credit Card]

Once you translate the resulting hex into the equivalent ASCII string you obtain “5362016689976147”, a valid MasterCard number (obviously we have used test credit card numbers throughout this blog entry).  We can simply apply the above steps to recover the full unencrypted credit card numbers for the entire database.  We have effectively used our own encrypted credit card number as a key to decrypt all of the other unknown credit cards in the database.  This is a pretty impressive feat given that we never knew the algorithm or the secret key used to encrypt all of the credit card information.
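
In fact, the entire attack fits in a few lines of Python.  The keystream below is fabricated (we never learned the real one), but that is exactly the point: it cancels out regardless of its value:

# Keystream reuse attack.  The keystream is unknown to the attacker; we
# fabricate one here only to produce self-consistent ciphertexts for the demo.
import os

known_plain  = b"4012888888881881"  # our own (known) test card
victim_plain = b"5362016689976147"  # unknown to the attacker
keystream    = os.urandom(16)       # stand-in for the shared cipher stream

xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))

c_known  = xor(known_plain, keystream)   # pulled from the database
c_victim = xor(victim_plain, keystream)  # pulled from the database

# The two keystream copies cancel, leaving known_plain XOR victim_plain...
mixed = xor(c_known, c_victim)
# ...so XORing our known plaintext back in recovers the victim's card.
print(xor(mixed, known_plain))  # b'5362016689976147'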

As it turns out, we eventually received source code for the functionality under investigation and discovered that the developers were using RC4.  RC4 has had its fair share of issues, and is generally not a recommended algorithm, but none of what we just discussed had anything to do with a weakness in RC4.  The above attack would have worked on any stream cipher used in the same way.  So, as is typical, the attack had more to do with how the cryptography was used than with the cryptographic algorithm itself.  We can’t say it enough…crypto is hard.  It just is.  There are just too many things that developers need to get right for crypto to be done correctly.  Algorithm selection, IVs, padding, modes of encryption, secret key generation, key expiration, key revocation…the list goes on and on.  We in the security community have been pounding the “don’t invent your own cryptography” mantra for a long time.  However, that has led to a swarm of developers simply leveraging proven cryptography insecurely.  So, we need to take this one step further.  On average, it is best for developers to defer to proven solutions that abstract the notion of cryptography even further.  For example, GPG, Keyczar, et al have abstractions that simply let you encrypt/decrypt/sign/verify things based off of a secret that you provide.  They handle proper padding, encryption mode selection, integrity protection, etc.  That is not to say that there aren’t use cases for needing to delve into the weeds sometimes, but for the average use case a trusted library is your friend.



ViewStateViewer: A GUI Tool for deserializing/reserializing ViewState

By: Patrick Toomey

Background

So, I was reading the usual blogs and came across a post by Mike Tracy from Matasano (Matasano has been having some technical difficulties…this link should work once they have recovered).  In the blog post Mike talks about the development of a ViewState serializer/deserializer for his WWMD web application security assessment console (please see Mike’s original post for more details).  Mike noted that the tools for viewing/manipulating ViewState from an application testing perspective are pretty weak, and I can’t agree more.  There are a number of ViewState tools floating around that do a tolerable job of taking in a Base64 encoded ViewState string and presenting the user with a static view of the deserialized object.  However, no tools up until this point, save for Mike’s new implementation, have allowed a user to change the values within the deserialized object and then reserialize them into a new Base64 string.  Mike’s tool does exactly this, and is immensely useful for web application testing.  The only downside to Mike’s tool is that it is built for his workflow and not mine (how can I fault the man for that).  So, I decided to build an equivalent that works well for me (and hopefully for you as well).

I tend to have a web proxy of some sort running on my machine throughout the day. There are tons of them out there and everyone seems to have their personal favorite. There is Paros, WebScarab, BurpSuite, and I am sure many others. In the last few months I have been using a newer entrant into the category, Fiddler.  Fiddler is a great web proxy whose only big drawback is that it is Windows only.  However, at least for me, the upsides to Fiddler tend to outweigh the negatives.  Fiddler has a fairly refined workflow (don’t get me started on WebScarab), is stable (don’t get me started on Paros), and is pretty extensible.  There are a number of ways to extend Fiddler, most trivially using their own FiddlerScript hooks.  In addition, there is a public API for extending the application using .NET.  Fiddler has a number of interfaces that can be extended to allow for inspecting and manipulating requests or responses. Please see the Fiddler site for more details on extending Fiddler using either FiddlerScript or .NET.  In particular, take a look at the development section to get a better feel for the facilities provided by Fiddler for extending the application.

Anyway, I had been thinking about writing a ViewState serializer/deserializer for Fiddler for the past month or two when I saw Mike’s blog post.  I decided that it was about time to set aside a little time and write some code.  I was lucky that Fiddler uses .NET, as I was able to leverage all of the system assemblies that Mike had to decompile in Reflector. After a bit of coding I ended up with my ViewStateViewer Fiddler inspector.  Let’s take a quick tour…

ViewStateViewer seamlessly integrates into the Fiddler workflow, allowing a user to manipulate ViewState just as they would any other variable in an HTTP request.  An extremely common scenario for testing involves submitting a request in the browser, trapping the request in a proxy, changing a variable’s value, and forwarding the request on to the server.  ViewStateViewer tries to integrate into this workflow as seamlessly as possible.  Upon trapping a request that contains ViewState, ViewStateViewer extracts the Base64 encoded ViewState, decodes it, deserializes it, and allows a user to manipulate the ViewState as they would any other variable sent to the server.  Let’s take a look at some screenshots to get a better idea of how this works.

ViewStateViewer Tour

[Figure: Serialized ViewState]

By default, Fiddler lets a user trap requests and view/edit a POST body before submitting the request to the server.  In this case, the POST body contains serialized ViewState that we would like to work with.  Without ViewStateViewer this is non-trivial, as Fiddler only shows us the Base64 encoded serialization.

[Figure: Deserialized ViewState]

ViewStateViewer adds a new “ViewState” tab within Fiddler that dynamically identifies and deserializes ViewState on the fly.  The top half of the inspector shows the original Base64 serialized ViewState string.  The bottom half of the inspector shows an XML representation of the deserialized ViewState.  In between these two views the user can see if the ViewState is MAC protected and the version of the ViewState being deserialized.  In this case we can see that this is .NET 2.X ViewState and that MAC protection is not enabled.

[Figure: Reserialized ViewState]

Once the ViewState is deserialized we can manipulate the ViewState by changing the values in the XML representation.  In this example we changed one of the string values to “foobar”, as can be seen in the figure above.  Once we change the value we can reserialize the ViewState using the “encode” button.  The reserialized Base64 encoded ViewState string can be seen in the top half of the ViewStateViewer window.  Once we have “encoded” the ViewState with our modifications, ViewStateViewer automatically updates the POST body with the new serialized ViewState string.  This request can now be “Run to Completion”, which lets Fiddler submit the updated request to the server.

Limitations

It should be noted that ViewStateViewer will not prevent you from deserializing, manipulating, and reserializing MAC protected ViewState; it simply cannot append a valid MAC to the modified ViewState, and will warn you that “Modified ViewState does not contain a valid MAC”.  Requests made using reserialized MAC protected ViewState will ultimately fail, as we don’t know the machine key used to produce a valid MAC.  Regardless, sometimes simply being able to view what is being stored in MAC protected ViewState can be useful during an application assessment.

In addition to MAC protection, ViewState can be optionally protected using encryption.  Encryption will prevent any attempt by ViewStateViewer to deserialize the ViewState.  If encryption is detected ViewStateViewer will simply show the user the original Base64 ViewState string.  However, as any application security consultant can attest, there are many applications that do not encrypt or MAC protect their ViewState.  ViewStateViewer is aimed squarely at these use cases.

Finally, ViewStateViewer was written entirely with deserializing/serializing ViewState 2.X in mind. While ViewState 1.X is supported, the support at this time is limited, though completely functional. ViewState 1.X, unlike ViewState 2.X, Base64 decodes into a completely human readable ASCII based serialization format (think JSON).  As such, ViewStateViewer simply Base64 decodes ViewState 1.X and displays the result to the user. The user is then free to make any changes they wish to the decoded ViewState. This works exactly the same as the example shown above, except that the decoded ViewState object will not be a nicely formatted XML document. I might get around to adding true ViewState 1.X support, but the benefits would be purely cosmetic, as the current implementation has no functional limitations.
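
Incidentally, telling the two formats apart is cheap.  The plugin’s actual detection code isn’t reproduced here, but a rough sketch of the version sniffing might look like the following (in Python for brevity):

# Rough approximation of ViewState version sniffing (not the plugin's code).
import base64

def viewstate_version(b64):
    raw = base64.b64decode(b64)
    if raw[:2] == b"\xff\x01":
        return ".NET 2.X (binary serialization format)"
    if raw[:2] == b"t<":
        return ".NET 1.X (human readable ASCII format)"
    return "unknown (possibly encrypted)"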

Wrap Up

ViewState is one of those areas that tends to be under-tested during application assessments, as there have been no good tools for efficiently evaluating the effects of modifying ViewState.  Hopefully the release of ViewStateViewer will make evaluation of ViewState a more common practice.  If you want to give ViewStateViewer a try you can download the Fiddler plugin (with source) here.  Simply copy ViewStateViewer.dll into your “Inspectors” folder within your Fiddler install directory.  It should be noted that this inspector is for Fiddler2, so please update your copy of Fiddler if you are out of date.  Finally, this is very much a work in progress.  As I found out during development, there is a ton of ViewState out there that is either non-trivial or outright impossible to deserialize/reserialize (non-standard objects being serialized, for example).  Maybe I’ll detail some of these difficult edge cases in a subsequent post.  So, while I have done some limited testing, I am sure there are some bugs that will crop up.  If you happen to find one please don’t hesitate to contact me.

Hulu…client-side “encryption”…seriously?

By: Patrick Toomey

I remember being pretty excited by the prospect of a service like Hulu.  The idea that major networks were actually coming together to stream mainstream video content was impressive.  It was such a departure from the locked down, share nothing mentality of old.  I thought to myself, “Wow, does Hollywood finally get it?”.  Apparently my optimism was exactly that…optimistic.

Sometime in the last week or so it was reported that Hulu, a video streaming service run by NBC and FOX, started “encrypting” Ajax responses to block unauthorized software clients (Boxee et al.) from sidestepping the hulu.com website to view content.  However, encryption is purposefully in quotes, as what Hulu actually implemented is a client-side obfuscation mechanism.  It is well known that such protection mechanisms are flawed by design and bound to be circumvented quickly.

The protective measure that is implemented rests on the obfuscation of Ajax responses made against hulu.com.  Instead of returning plaintext HTML content, Ajax requests return obfuscated URL encoded strings.  These URL encoded strings are reverted to plaintext on the client-side using JavaScript.  For example, a request to:

http://www.hulu.com/channels/Home-and-Garden?kind=videos&sort=popularity

returns a URL encoded string that begins:

dobfu__%F2%9E%84%88%EE%99%81%9F%BD%89%D0%DC …

The entire string is approximately 141KB long.  Other than the “dobfu__” prefix, the remainder of the string is URL encoded.  This obfuscated string is transformed into plaintext by a JavaScript function called “_dobfu()”.  This function, after a bit of reformatting, is reproduced below:

function _dobfu(text) {
  // Pass non-obfuscated responses through untouched
  return text.slice(0,7)!='dobfu__'?text:
    $A(unescape(text.substring(7)).tol()).map(function(i) {
      i=0xfeedface^i;  // XOR each 32-bit word with a hard-coded constant
      return String.fromCharCode(i&0xFF,i>>>8&0xFF,i>>>16&0xFF,i>>>24&0xFF);
    }
  ).join('').replace(/\0+$/,'');  // strip trailing NULL padding
}

All of the above code is pretty easy to follow, save for the references to the $A() and tol() functions.  The $A() function is a Prototype global function that creates a full array object from any other object that can pass for an array (supports indexing, etc).  This is done so that the new object inherits the full functionality of an array (the map method is needed in this case).  The second piece of ambiguous logic, the tol() method, is defined in another JavaScript file and is reproduced below:

String.prototype.tol=function(){
  // Pack each 4-byte chunk of the string into a little-endian 32-bit integer
  var s=this;
  return $R(0,Math.ceil(s.length/4)-1).map(
    function(i){
      return s.charCodeAt(i*4)+(s.charCodeAt(i*4+1)<<8)+(s.charCodeAt(i*4+2)<<16)+(s.charCodeAt(i*4+3)<<24);
    }
  );
};

Essentially this method takes a string of bytes and creates an array of 32-bit integers from each 4-byte chunk.  For example, if the string processed in the method was “\x01\x23\x45\x67\x89\xab\xcd\xef” the method would return the array [0x67452301, 0xefcdab89].  The ordering of the individual bytes is a result of the “tol()” method parsing the data as little-endian.

So, with those two functions defined we can quickly describe how Hulu de-obfuscates responses.  The obfuscated string is broken up into 4-byte integers.  Since the length of the obfuscated string is always evenly divisible by four we are guaranteed that a string of length x will turn into an array of 4-byte integers of length x/4.  Then, for each 4-byte integer, the value is XORed with the constant “0xfeedface”.  Once XORed, the individual bytes from the integer are split apart and converted back to their equivalent ASCII value.  Finally, all trailing NULL bytes are removed from the de-obfuscated string.
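
For anyone who wants to play along at home, the whole de-obfuscation scheme can be reproduced in a few lines of Python.  This is a sketch that mirrors the JavaScript above; I make no promises it handles every response Hulu serves:

# Python equivalent of Hulu's _dobfu()/tol() de-obfuscation logic.
import struct
from urllib.parse import unquote_to_bytes

def dobfu(text):
    if not text.startswith("dobfu__"):
        return text
    raw = unquote_to_bytes(text[7:])                      # URL decode
    words = struct.unpack("<%dI" % (len(raw) // 4), raw)  # little-endian 32-bit words
    plain = b"".join(struct.pack("<I", w ^ 0xFEEDFACE) for w in words)
    return plain.rstrip(b"\x00").decode("latin-1")        # strip NULL padding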

It is a bit difficult to imagine what Hulu thought they might accomplish with the above scheme.  It effectively does nothing to prevent third-party tools from performing the same obfuscation/de-obfuscation.  Any scheme that attempts to implement client-side “decryption”, particularly in JavaScript, is bound for failure.  The client possesses the obfuscated message, the key to de-obfuscate the message, and the JavaScript that executes the algorithm.  Using these components, it is a trivial exercise to transform any obfuscated response back into plaintext.  Hulu likely thwarted unauthorized software for the better part of an afternoon and no more.  Client-side security mechanisms simply don’t work.  Even complex systems implemented in native code, such as popular DRM schemes, that may go unbroken for a period of time, will eventually be circumvented.  However, implementing a similar preventative measure in JavaScript lowers the difficulty of circumvention dramatically.

Beyond the technical discussion there is also a broader question to be asked.  What was the net gain for Hulu?  They failed to accomplish their implicit goal: to block unauthorized software.  Hulu simply received another bit of bad press for treating their customers like thieves.  Hulu, and other such services, need to realize that the ubiquitous availability of their content will ultimately grow their fan base.  There is ever increasing competition for a viewer’s eyes and ears.  Podcasts, YouTube, gaming, etc are all competing.  Third-party products, such as Boxee, only serve to increase the ubiquity of their content, which shouldn’t be viewed as a bad thing.  Thwarting their own customers only sours the experience and reinforces the presumption that a good chunk of the entertainment industry just doesn’t get it.  Besides being bad security, this latest debacle is just bad business.

Crypto Pet Peeves: Hashing…Encoding…It’s All The Same, Right?

Patrick Toomey

© 2008 Neohapsis

We all know cryptography is hard. Time and time again we in the security community give advice that goes something like, “Unless you have an unbelievably good reason for developing your own cryptography, don’t!”. Even if you think you have an unbelievably good reason I would still take pause and make sure there is no other alternative. Nearly every aspect of cryptography is painstakingly difficult: developing new crypto primitives is hard, correctly implementing them is nearly as hard, and even using existing crypto APIs can be fraught with subtlety. As discussed in a prior post, Seed Racing, even fairly simple random number generation is prone to developer error. Whenever I audit source I keep my eyes open for unfamiliar crypto code. So was the case on a recent engagement; I found myself reviewing an application in a language that I was less familiar with: Progress ABL.

Progress ABL is similar to a number of other 4GL languages, simplifying development given the proper problem set. Most notably, Progress ABL allows for rapid development of typical business CRUD applications, as the language has a number of features that make database interactions fairly transparent. For those of you interested in learning more, the language reference manual can be found on Progress’ website.

As I began my review of the application I found myself starting where I usually do: staring at the login page. The application was a fairly standard web app that required authentication via login credentials before accessing the sensitive components of the application. Being relatively unfamiliar with ABL, I was curious how they would handle session management. Sure enough, just as with many other web apps, the application set a secure cookie that uniquely identified my session upon login. However, I noticed that the session ID was relatively short (sixteen lower/upper case letters followed by four digits). I decided to pull down a few thousand of the tokens to see if I noticed any anomalies. The first thing I noticed was that the four digit number on the end was obviously not random, as values tended to repeat, cluster together, etc. So, the security of the session ID must lie in the sixteen characters that precede the four digits. However, even the sixteen characters did not look so random. Certain letters appeared to occur more than others. Certain characters seemed to follow other characters more than others. But, this was totally unscientific; strange patterns can be found in any small sample of data. So, I decided to do a bit more scientific investigation into what was going on.

Just to confirm my suspicions I coded up a quick Python script to pull down a few thousand tokens and count the frequency of each character in the token. Several minutes later I had a nice graph in Excel.
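
The script itself was throwaway; something along these lines (the URL and cookie name below are placeholders, not the client’s actual application):

# Pull a few thousand session tokens and tally character frequencies.
# Hypothetical endpoint and cookie name; assumes the requests package.
from collections import Counter
import requests

counts = Counter()
for _ in range(2000):
    resp = requests.get("https://app.example.com/login")  # placeholder URL
    counts.update(resp.cookies["SESSIONID"][:16])         # placeholder cookie name

for char, count in sorted(counts.items()):
    print(char, count)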

[Figure: Histogram of Encode Character Frequency]

Ouch! That sure doesn’t look very random. So, I opened up Burp Proxy and used the Sequencer to pull down a few thousand more session cookies. The Burp Sequencer has support for running a number of tests, including a set of FIPS-compliant statistical tests for randomness. To obtain a statistically significant result Burp analyzes a sample size of 20,000 tokens. Since the four digits at the end of the session ID provided little to no entropy, I discarded them from the analysis. It seemed obvious that the sixteen character sequence was generated using some sort of cryptographic hash, and the four digit number was generated in some other way. I was more interested in the entropy provided by the hash. So, after twenty minutes of downloading tokens, I let Burp crunch the numbers. About 25 seconds later Burp returned an entropy value of 0 bits. Burp also produced a graph like the one below, showing the entropy of the data at various significance levels.

[Figure: Encode Entropy Estimation]

Hmmm, maybe Burp is broken. I was pretty sure I had successfully used the Burp Sequencer before. Maybe it was user error, a bug in the current version, who knows. I decided that a control was needed, just to ensure that the tool was working the way I thought it should. So, I wrote a bit more Python to simply print the hex-encoded value of a SHA1 hash of each of the numbers 1-20,000. I loaded this data into Burp and analyzed the data set. Burp estimated the entropy at 153 bits. Just to compare with the prior results, here are the distribution graph and the Burp entropy results for the SHA1 output:

[Figure: Histogram of SHA1 Character Frequency]

[Figure: SHA1 Entropy Estimation]

I repeated the same test against a set of JSESSIONID tokens and found a similarly acceptable result. Ok, so the Burp Sequencer seems to be working.
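
For reference, generating the SHA1 control set takes all of a few lines; this is the gist of what I ran:

# Control data set: hex-encoded SHA1 of the integers 1..20000.
import hashlib

with open("sha1_tokens.txt", "w") as f:
    for i in range(1, 20001):
        f.write(hashlib.sha1(str(i).encode()).hexdigest() + "\n")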

So, I next went hunting for the session token generation code in the application. After a little grepping I found the function for generating new session tokens. Ultimately the function took a number of values and ran them through a function called “ENCODE”. Hmmm, ENCODE, that didn’t sound familiar. Some more grepping through the source did not reveal any function definitions, so I assumed the function must be part of the standard library for ABL. Sure enough, on page 480 of the language reference manual there was a description of the ENCODE function.

“Encodes a source character string and returns the encoded character string result”

The documentation then goes on to state:

“The ENCODE function performs a one-way encoding operation that you cannot reverse.  It is useful for storing scrambled copies of passwords in a database. It is impossible to determine the original password by examining the database. However, a procedure can prompt a user for a password, encode it, and compare the result with the stored, encoded password to determine if the user supplied the correct password.”

That is the least committal description of a hash function I’ve ever had the pleasure of reading. It turns out the application, as well as a third party library the application depends upon, uses this function for generating session tokens, storing passwords, and generating encryption keys. For the sake of reproducibility I wanted to be sure my data was not the result of some strange artifact in their environment. So, I installed the ABL runtime locally and coded up a simple ABL script to call ENCODE on the numbers 1-20,000. I reran the Burp Sequencer and got exactly the same result: 0 bits.

At this point I was fairly sure that ENCODE was flawed from a hashing perspective. A good quality secure hash function, regardless of how correlated the inputs are (as the numbers 1-20,000 obviously would be), should produce output that is indistinguishable from truly random values (see Cryptographic Hash Functions and Random Oracle Model for more information). ENCODE clearly does not meet this definition of a secure hash function. But, 0 bits, that seems almost inconceivably flawed. So, giving them the benefit of the doubt, I wondered if the result was dependent on the input. In other words, I conjectured that ENCODE might perform some unsophisticated “scrambling” operation on the input, and thus input with low entropy would have low entropy on the output. Conversely, input with high entropy might retain its entropy on output. This still wouldn’t excuse the final result, but I was curious nonetheless. My final test was to feed each of my SHA1 results through the ENCODE function. Since the output of the SHA1 function contains high entropy I conjectured that ENCODE, despite its obvious flaws, might retain this entropy. The results are shown below:

[Figure: Histogram of SHA1 then Encode Character Frequency]

[Figure: SHA1 then Encode Entropy Estimation]

ENCODE manages to transform an input with approximately 160 bits of entropy into an output that, statistically speaking, contains 0 bits of entropy. In fact, the frequency distribution of the character output is nearly identical to the first graph in this post.

This brings me back to my opening statement, “Unless you have an unbelievably good reason for developing your own cryptography, don’t!”. I can’t figure out why this ENCODE function exists. Surely the ABL library has support for a proper hash function like SHA1, right? Yes, in fact it does. The best explanation I could come up with is that it is a legacy API call. If that is the case then the call should be deprecated and/or documented as suitable only in cases where security is of no importance. The current API does the exact opposite, encouraging developers to use the function for storing passwords. Cryptography is hard, even for those of us that understand the subtlety involved. Anything that blurs the line between safe and unsafe behavior only makes the burden on developers even greater.

It is unclear, based on this analysis, how much effort it would require to find collisions in ABL’s ENCODE function. But, even this simple statistical analysis should be enough for anyone to steer clear of its use for anything security related. If you are an ABL developer I would recommend that you try replacing ENCODE with something else. As a trivial example, you could try: HEX-ENCODE(SHA1-DIGEST(input)). Obviously you need to test and refactor any code that this breaks. But, you can at least be assured that SHA1 is relatively secure from a hashing perspective. That said, you might want to start looking at SHA-256 or SHA-512, given the recent chinks in the armor of SHA1.

Unfortunately, it does not appear that ABL has support for these more contemporary SHA functions in its current release.

Ok…slowly stepping down off my soapbox now.  Bad crypto just happens to be one of my pet peeves.

Footnote:

Just before posting this blog entry I decided to email Progress to see if they were aware of the behavior of the ENCODE function.  After a few back-and-forth emails I eventually got a reply that described the ENCODE function as using a CRC-16 to generate its output (it is not the direct output, but CRC-16 is the basic primitive used to derive the output).  Unfortunately, CRCs were never meant to have any security guarantees.  CRCs do an excellent job of detecting accidental bit errors in a noisy transmission medium.  However, they provide no guarantees if a malicious user tries to find a collision.  In fact, maliciously generating inputs that produce identical CRC outputs is fairly trivial.  As an example, the linearity of the CRC-32 algorithm was noted as problematic in an analysis of WEP.  Thus, despite the API doc recommendation, I would highly recommend that you not use ENCODE as a means of securely storing your users’ passwords.
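
Just how trivial?  With only 16 bits of output, a brute-force search finds a collision almost immediately.  A quick sketch (using the common CRC-16/CCITT variant for illustration; Progress has not published ENCODE’s exact parameters):

# Brute-force CRC-16 collision search. With only 65,536 possible outputs a
# collision among simple numeric strings shows up after a few hundred tries.
def crc16(data, poly=0x1021):
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

seen = {}
i = 0
while True:
    c = crc16(str(i).encode())
    if c in seen:
        print("collision: %d and %d both map to 0x%04x" % (seen[c], i, c))
        break
    seen[c] = i
    i += 1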