CVSS – Vulnerability Scoring Gone Wrong

By Patrick Toomey

If you have been in the security space for any stretch of time you have undoubtedly run across the Common Vulnerability Scoring System (CVSS).  CVSS attempts to provide an “objective” way to calculate a measure of risk associated with a given vulnerability based on a number of criteria the security community has deemed worthwhile.  While I admire the goals of such a scoring system, in practice I think it falls short, and over-complicates the issue of assigning risk to vulnerabilities.  Before we get into my specific issues with CVSS, let’s briefly review how a CVSS score is calculated.  Put simply, the calculation tries to take into account criteria such as:

  • Exploitability Metrics (i.e. probability)
  • Impact Metrics (i.e. severity)
  • Temporal Metrics (extra fudge factors for probability)
  • Environmental Metrics (extra fudge factors for severity)

Each of the above categories is composed of a number of questions/criteria that are used as input into a calculation that results in a value between 0.0 and 10.0.  This score is often reported with publicly disclosed vulnerabilities as a means of conveying the relative importance of fixing/patching the affected software.  The largest source of public CVSS scores is the National Vulnerability Database (NVD), which publishes XML documents containing a CVSS score for every CVE from 2002 to 2012.  In addition to the NVD, I’ve also seen CVSS used by various security tools as well as internally by numerous organizations, as it doesn’t require reinventing the wheel when ranking vulnerabilities.  So, what’s wrong with CVSS?
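Before answering that, it is worth seeing just how mechanical the calculation is.  Here is a quick Java sketch of the CVSS v2 base equation as I read the spec (the weights are the published v2 values; the metric combination in the example is just an illustration):

public class CvssV2Sketch {
    // Impact sub-score from the confidentiality/integrity/availability weights
    // (None = 0.0, Partial = 0.275, Complete = 0.660)
    static double impact(double c, double i, double a) {
        return 10.41 * (1 - (1 - c) * (1 - i) * (1 - a));
    }

    // Exploitability sub-score from access vector, access complexity, and authentication
    static double exploitability(double av, double ac, double au) {
        return 20 * av * ac * au;
    }

    static double baseScore(double impact, double exploitability) {
        double fImpact = (impact == 0) ? 0 : 1.176;
        double raw = ((0.6 * impact) + (0.4 * exploitability) - 1.5) * fImpact;
        return Math.round(raw * 10) / 10.0;  // round to one decimal place
    }

    public static void main(String[] args) {
        // Example: AV:N (1.0), AC:L (0.71), Au:N (0.704), C/I/A all Partial (0.275)
        double i = impact(0.275, 0.275, 0.275);
        double e = exploitability(1.0, 0.71, 0.704);
        System.out.println(baseScore(i, e));  // prints 7.5
    }
}

A handful of inputs, a pile of magic constants, and out pops a number with one decimal place of implied precision.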

There are so many things I dislike about CVSS, though I will freely admit I am not steeped in CVSS lore, and would be open to hearing/discussing the reasoning behind the scoring system.  That said, here are my issues with CVSS in no particular order.

We don’t measure football fields in inches for a reason

Nobody cares that the distance between goal lines on an American football field is 3600 inches.  Why?  Because it is a useless unit of measurement when we are talking about football.  Nobody cares if someone has made 2 inches of progress on the field, as yards are the only thing that matters.  Similarly, what is an organization supposed to take away from a CVSS score that can take on roughly 100 potential values?  Is a 7.2 meaningfully different from a 7.3 when someone is deciding whether or not to fix something?  A reasonable counterargument to the claim that CVSS is too fine grained is that you can always bubble the result up into a more coarse unit of measure.  But, that leads to my second complaint.

The “fix” is broken

So, sure, 100 distinct values is overkill for ranking vulnerabilities, and CVSS acknowledges this to some degree by mapping the overall score to a “severity score” of High, Medium, and Low.  On the surface this seems reasonable, as it abstracts the ugly sausage-making details of the numeric CVSS score into a very actionable severity score.  But, I feel like they managed to mess this up as well.  They started with a pretty fine granularity and bubbled up to something that is too coarse, as it tends to blur together various high severity vulnerabilities.  I’ve always been a fan of a four point scale that breaks down as follows:

  • Critical – The vulnerability needs to have been fixed yesterday.  The entire team responsible will not sleep until the vulnerability has been fixed.
  • High – This vulnerability is serious and we are going to fix it in the near term, but we also don’t need to make everyone lose sleep over it.
  • Medium – This vulnerability is worth fixing, and we will set a relatively fixed date in the near future for when it will be fixed.
  • Low – This vulnerability is on our radar and if it fits in our next release schedule we will fix it.

As it happens, a fairly large project manages to get by pretty well using a system roughly analogous to the one described above.  Google’s Chrome project has used a similar rating system and I haven’t heard anyone complain.  I was curious how this mapping would work against CVSS scores, so I plotted all of the CVSS scores for every CVE within the NVD from 2002 until 2012.  The results are as follows:

As can be seen, there are some pretty obvious groupings of scores within this data.  Without staring at the data too hard you can see that there are clearly four groupings of scores that would map very cleanly to the four point system I mentioned earlier.

The main thing to make note of here is that there is a vast chasm between each grouping and its nearest neighbor(s).  There is very little chance of mistaking a low vulnerability for a medium vulnerability.  In contrast, with the current CVSS scoring system the grouping looks more like this:

There are some seemingly arbitrary dividing lines between High, Medium, and Low scores.  Particularly troubling is the dividing line between Medium and High: anything scored below 7.0 is a Medium and anything at or above 7.0 is a High.  Unfortunately, a fair bit of data is clustered at exactly that juncture.  This leads to my final complaint against CVSS.
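For the curious, that severity mapping amounts to nothing more than a couple of cut-offs.  A throwaway Java sketch of the NVD ranges (the cut-off values are the published ones; the method itself is mine):

static String severity(double baseScore) {
    // NVD CVSS v2 severity ranges
    if (baseScore >= 7.0) return "High";    // 7.0 - 10.0
    if (baseScore >= 4.0) return "Medium";  // 4.0 - 6.9
    return "Low";                           // 0.0 - 3.9
}

A 6.9 and a 7.0 land in different buckets, despite the data clustering right at that boundary.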

Objectivity is in the eye of the beholder

As mentioned in the beginning of the blog entry, a CVSS score is based on a set of base metrics, but can be adjusted using a number of “Temporal” and “Environmental” metrics.  In other words, given a base score, you can tweak it however you want using a number of fuzzy criteria.  This, compounded with the coarse High, Medium, Low severity scores, leads to a troubling amount of score fiddling.  I am not going to go all conspiracy theory on you and claim people are fudging numbers for publicly disclosed CVEs.  But, I have seen internal groups within companies leveraging these additional metrics to make the data fit their desired outcome.  I can’t blame them, as it is almost a requirement.  When presented with a vulnerability there is generally an internal consensus about how serious it is to the organization and whether it is a Critical, High, Medium, or Low (as I defined them above).  However, once they enter all of the base metrics into the CVSS calculator there is a reasonable chance that it is going to produce a score that doesn’t mesh with their gut.  So, adjustments are made to the temporal and environmental metrics until the calculator gives them the appropriate score.  Again, I blame nobody for “fudging” the data, as often the base score just doesn’t work.  One could argue that the temporal and environmental scores could be adjusted in a reliable/repeatable way for a given application/environment.  Then, anytime a vulnerability is identified in that specific application, the same temporal/environmental adjustments could be used to create reliable/repeatable scores.  In reality, this doesn’t happen.  An organization should be praised for using any kind of scoring system at all.  Trying to enforce an extra level of unnecessary/burdensome process is neither worthwhile nor realistic.

Conclusion

Even with all the above being said, as soon as you pitch the idea of using a four point scoring system you run into the problem of objectivity.  How do we decide what criteria delineate a Critical from a High vulnerability?  I am sure that is how CVSS started, as it provided an approach for scoring things objectively.  But, as we already discussed, it is only superficially objective, as there are numerous ways to adjust the score using subjective metrics.  So, why bother?  I think following a model similar to the Chrome severity guidelines makes more sense.  The Chrome team has developed some specific criteria they use to group vulnerabilities.  Given that they are only trying to place a vulnerability into one of four buckets, it isn’t that difficult.  Most organizations could come up with a similar set of organization-specific criteria for assigning a vulnerability score.  In the end, while I am a fan of standardization in general, I am not a fan of the current standard for vulnerability scoring.  Not to be too cliché, but an Albert Einstein quote sums up my thoughts pretty well: “Everything should be made as simple as possible, but no simpler”.  I think CVSS could use a little simplifying.

Ambiguous RFC leads to Cross Site Scripting

By Patrick Toomey

Sometime in January I was on an application assessment and noticed that user input was being used to generate a link to another application.  In other words, I would send a request that looked like:

http://www.site1.com/test.jsp?var1=val1

and the application was generating some HTML that looked like:

<a href="http://www.site2.com/test2.jsp?var1=val1">Click Me</a>

This is not atypical and I have probably seen it a fair number of times in the past.  This functionality can be implemented in a number of ways, but the way it was implemented in this application was the following:

return "http://www.site2.com/test2.jsp?"+request.getQueryString();

So, depending on what  request.getQueryString() returns, this may be used for XSS.  In other words, submitting the following:

http://www.site1.com/test.jsp?var1=val1"><script>alert(1)</script>

could lead to pretty straightforward  XSS, with the above Java code generating the following HTML:

<a href="http://www.site2.com/test2.jsp?var1=val1">
<script>alert(1)</script>">Click Me</a>

Ok, before you chastise me for demonstrating basic XSS, let’s dig a bit deeper.  It turns out that the above request fails to inject JavaScript into the generated link in Chrome, Safari, and Firefox.  However, it does work in Internet Explorer.  Chrome, Safari, and Firefox all URL encode the ", >, and < characters, while IE encodes none of them.  For example, Safari encodes the request as follows:

GET /test.jsp?var1=val1%22%3E%3Cscript%3Ealert(1)%3C/script%3E HTTP/1.1

while IE sends the following:

GET /test.jsp?var1=val1"><script>alert(1)</script>

As can be seen, IE does not URL encode any of the characters, while Safari (et al.) tends to URL encode values that might be misinterpreted in another context.  My memory is not what it once was, but I feel like I had run into this in the past and just passed it off as an IEism.  Every browser seems to have its own set of strange edge cases that only occur in that particular browser (whether we are talking about security or functionality).  I would have probably just blown this off as another IEism, forgotten about it, and remembered it months down the line when I ran into it again in another application.  Instead, not a week went by before a coworker emailed me with the exact same observation.

In the email he, almost verbatim, described the application he was assessing and noted how IE seemed to be aberrant when it came to this edge case.  At this point I started thinking that maybe I had not run into this in the past, and maybe it was not just my memory failing me.  Maybe IE had changed its encoding rules and this behavior was introduced in a more recent version of IE.  A quick round of searching and I found this.  It turns out Imperva had submitted this exact issue to MSFT about a week before my coworker and I noticed the issue.  At this point I was totally confused.  Surely my coworker, the engineer at Imperva, and I all noticing the issue within the same month could not be a coincidence.

After reading through the Imperva blog post I brought up my Windows XP VM to test this on every version of IE since 6 to see when this oversight was introduced.  Well, to cut things short, I found identical behavior in IE 6, 7, 8, 9, and 10 (had to test 10 in the Windows 8 public release).  I did not dig into prior releases of other browsers to see when they implemented the URL encoding of ", >, and < (among other characters), but if my poor memory serves me, I feel like this has been a common practice for quite a while.  Quoting from the Imperva blog post, Microsoft’s response to the observed behavior was the following:

Thank you for writing to us.  The behavior you are describing is something that we are aware of and are evaluating for changes in future versions of IE, however it’s not something that we consider to be a security vulnerability that will be addressed in a security update.

So, is MSFT just being stubborn and knowingly violating the RFC?  Well, as far as I can tell, no.  I believe there are some other drafts, but the most current finalized RFC dealing with URIs is RFC 3986.  In particular, Section 2 talks about characters, reserved characters, unreserved characters, URL encoding, etc.  One would think that if you are going to use the terms “reserved characters” and “unreserved characters” that this would divide the world of all characters into the “reserved character” set and the “unreserved character” set.  That only makes sense, right?  Well, here is a list of the reserved characters:

":" / "/" / "?" / "#" / "[" / "]" / "@" / "!" / "$" / "&" / "'" 
/ "(" / ")" / "*" / "+" / "," / ";" / "="

and the unreserved characters:

ALPHA / DIGIT / "-" / "." / "_" / "~"

Conspicuously absent are the ", >, and < characters (as well as others).  This is strange for a number of reasons, one of which has to do with the fact that this exact issue is mentioned in RFC 1738 (RFC 3986 updated RFC 1738).  In RFC 1738 there is a section that explicitly mentions “Unsafe” characters.  This section, in part, states:

The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems.

RFC 3986 doesn’t mention unsafe characters anywhere (correct me if I am wrong; it is easy to miss a line in an RFC).  It would appear that IE is actually not violating the RFC.  Instead, they just happen to have implemented their URL encoding scheme in a way that is in line with the “reserved characters” and “unreserved characters” definitions, but different than everyone else.  My guess is that IE has left this in place for the same reason MSFT often leaves things in place…backward compatibility.  I see no reason why they would not prefer the more secure behavior if they were confident it would not break existing applications.  Moreover, I would imagine that they would have gladly implemented it the same way as everyone else if the RFC had actually unambiguously defined the expected behavior.
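Of course, regardless of which browser encodes what, an application shouldn’t be leaning on the client to do its output encoding.  The sketch below is my own illustration of the kind of server-side fix I’d suggest (the class and method names are made up); the idea is simply to rebuild the query string with each name and value URL encoded rather than echoing the raw query string:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import javax.servlet.http.HttpServletRequest;

public class LinkBuilder {
    public static String buildLink(HttpServletRequest request) throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder("http://www.site2.com/test2.jsp?");
        boolean first = true;
        // Rebuild the query string, URL encoding each name and value, instead of
        // echoing request.getQueryString() and hoping the browser encoded it.
        for (Map.Entry<String, String[]> entry : request.getParameterMap().entrySet()) {
            for (String value : entry.getValue()) {
                if (!first) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(entry.getKey(), "UTF-8"));
                sb.append('=');
                sb.append(URLEncoder.encode(value, "UTF-8"));
                first = false;
            }
        }
        return sb.toString();
    }
}

With the parameters encoded server side, the generated href contains no raw ", <, or > characters no matter which browser sent the request.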

So, in the end, the fact that three different security engineers all noted the same odd IE behavior in the same month was actually just coincidence.  I guess I can only dream of a day when some sort of formal verification system will free us from RFC ambiguity.  But, for now, we can probably bank on these ambiguities continuing to introduce strange edge case security issues.

Keychain Dumper Updated for iOS 5

By Patrick Toomey

Direct link to keychaindumper (for those that want to skip the article and get straight to the code)

Last year I pushed a copy of Keychain Dumper to GitHub to help audit what sensitive information an application stores on an iOS device using Apple’s Keychain service.  I’ve received a few issue submissions on GitHub from people having trouble getting Keychain Dumper to work on iOS 5.  I meant to look into it earlier, but I was not able to dedicate any time until this week.  Besides a small update to the Makefile to make it compatible with the latest SDK, the core issue seemed to have something to do with code signing.

Using the directions I published last year the high level steps to dump keychain passwords were:

  1. Compile
  2. Push keychain_dumper to iOS device
  3. Use keychain_dumper to export all the required entitlements
  4. Use ldid to sign these entitlements into keychain_dumper
  5. Rerun keychain_dumper to dump all accessible keychain items

Steps 1-3 continue to work fine on iOS 5.  It is step 4 that breaks, and it just so happens that this step was the one step I completely took for granted, as I had never looked into how ldid works.  Originally, signing the entitlements into keychain_dumper was as simple as:

./keychain_dumper -e > /var/tmp/entitlements.xml
ldid -S/var/tmp/entitlements.xml keychain_dumper

However, on iOS 5 you get the following error when running ldid:

codesign_allocate: object: keychain_dumper1 malformed object
(unknown load command 8)
util/ldid.cpp(582): _assert(78:WEXITSTATUS(status) == 0)

Looking around for the code to ldid, I found that ldid actually doesn’t do much of the heavy lifting with regard to code signing, as ldid simply execs out to another command, codesign_allocate (the allocate variable below):

execlp(allocate, allocate, "-i", path, "-a", arch, ssize, "-o", temp, NULL);

The Cydia code for codesign_allocate looks to be based off of the odcctools project that was once open-sourced by Apple.  I am not certain, but it appears as though this codebase was eventually made closed-source by Apple, as the code does not appear to have been updated anytime recently.  Digging into the code for codesign_allocate, the error above:

unknown load command 8

makes much more sense, as codesign_allocate parses each of the Mach-O headers, including each of the load commands that are part of the Mach-O file structure.  It appears that load command 8 must have been added sometime between when I first released Keychain Dumper and now, as the version of codesign_allocate that comes with Cydia does not support parsing this header.  This header is responsible for determining the minimum required version of iOS for the application.  If someone knows a compile time flag to prevent this (and possibly other) unsupported header(s) from being used, let me know and I’ll update the Makefile.  The other options for getting the tool working again were to either update odcctools to work with the new Mach-O structure or figure out an alternative way of signing applications.

Historically there have been three ways to create pseudo-signatures for Cydia-based applications (you can see them here).  The third uses sysctl, and is no longer valid, as Apple made some changes that render the relevant configuration entries read-only.  The second option uses ldid, and was the approach I used originally.  The first uses the tools that come with OS X to create a self-signed certificate and then uses the command line development tools to sign your jailbroken iOS application with that certificate.  It appears as though the tools provided by Apple are basically updated versions of the odcctools project referenced earlier.  The same codesign_allocate tool exists, and looks to be more up to date, with support for parsing all the relevant headers.  I decided to leverage the integrated tools, as that seems the best way to ensure compatibility going forward.  Using Apple’s tools I was able to sign the necessary entitlements into keychain_dumper and dump all the relevant Keychain items as before.  The steps for getting this to work are as follows:

  1. Open up the Keychain Access app located in /Applications/Utilities/Keychain Access
  2. From the application menu open Keychain Access -> Certificate Assistant -> Create a Certificate
  3. Enter a name for the certificate, and make note of this name, as you will need it later when you sign Keychain Dumper.  Make sure the Identity Type is “Self Signed Root” and the Certificate Type is “Code Signing”.  You don’t need to check the “Let me override defaults” unless you want to change other properties on the certificate (name, email, etc).
  4. Compile Keychain Dumper as instructed in the original blog post (condensed below):
ln -s /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.0.sdk/ sdk
ln -s /Developer/Platforms/iPhoneOS.platform/Developer toolchain
make
  5. scp the resulting keychain_dumper to your phone (any file system reference made here will be in /tmp)
  6. Dump the entitlements from the phone
./keychain_dumper -e > /var/tmp/entitlements.xml
  7. Copy the entitlements.xml file back to your development machine (or just cat the contents and copy/paste)
  8. Sign the application with the entitlements and certificate generated earlier (you must select “Allow” when prompted to allow the code signing tool to access the private key of the certificate we generated)
codesign -fs "Test Cert 1" --entitlements entitlements.xml keychain_dumper
  9. scp the resulting keychain_dumper to your phone (you can remove the original copy we uploaded in step 5)
  10. You should now be able to dump Keychain items as before
./keychain_dumper
  11. If all of the above worked you will see numerous entries that look similar to the following:
Service: Vendor1_Here
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: SenSiTive_PassWorD_Here

Now, the above directions work fine, but after dumping the passwords something caught my eye.  Notice the asterisk in the above entitlement group.  The Keychain system restricts access to individual entries according to the “Entitlement Group”, which is why we first dumped all of the entitlement groups used by applications on the phone and then signed those entitlements into the keychain_dumper binary.  I thought that maybe the asterisk in the above entitlement group had some sort of wildcarding properties.  So, as a proof of concept, I created a new entitlements.xml file that contains a single entitlement:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"    "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
 <dict>
     <key>keychain-access-groups</key>
     <array>
         <string>*</string>
     </array>
 </dict>
</plist>

The above file lists a single entitlement defined by “*”, which if the assumption holds, might wildcard all Entitlement Groups.  Using the above file, I recompiled Keychain Dumper and proceeded to resign the application using the wildcard entitlements.xml file.  Surprise….it works, and simplifies the workflow.  I have added this entitlements.xml file to the Git repository.  Instead of first needing to upload keychain_dumper to the phone, dump entitlements, and then sign the app, you can simply sign the application using this entitlements.xml file.  Moreover, the binary that is checked in to Git is already signed using the wildcard entitlement and should work with any jailbroken iOS device (running iOS 5), as there is no need to customize the binary for each user’s device/entitlements.

The consolidated directions can be found on the GitHub repository page, found here.  A direct link to the binary, which should hopefully work on any jailbroken iOS 5 device, can be found here.

Apple Software Security…Thinking Different

By Patrick Toomey

Last year Charlie Miller presented some fuzzing results on OS X, including fuzzing PDF files through OS X’s Preview application. Out of 2.8 million PDF files Preview crashed approximately 160,000 times. Obviously one can’t state anything categorical about the exploitability of these, but it does speak to a general code quality issue within Apple. Similarly, the security community has rightfully chastised Apple for their implementation of ASLR, DEP, etc. These stories, along with those of recent OS X specific malware, have gotten a fair bit of attention in the security community, with most everyone agreeing that Apple is going to have to do better. However, should we have really expected anything different?  As with any software company Apple has their priorities and apparently they didn’t think cleanly handling arbitrarily fuzzed PDF files and ASLR were going to sell more Macs. Microsoft took the same approach (prioritizing features/user experience over security) for many years before they were forced to swallow a big heaping spoonful of medicine. It could even be argued that taking the opposite approach (valuing security over user convenience/experience) isn’t the right answer either. That is not to say that the reason Linux never took off on the desktop is because they were too focused on security. But, Linux has often placed a high degree of value on the technical underpinnings of the platform rather than rallying behind user experience (this is obviously changing a bit with the likes of Canonical). The take away is that each of these platforms has a target audience and none of them are doing anything altruistic and/or evil, they are simply each catering to a different market.

So, with the upcoming release of Mac OS X Lion, I got to thinking about where Apple is going from a security perspective.  With many companies you can easily predict what they will do because you can just point to the five things that other people are doing better, and the obvious next step is for them to do those things better as well.  But, that isn’t generally Apple’s style.  Years ago Apple had their “Think Different” campaign. While obviously marketing speak, there is an air of truth to it. Apple has never been afraid of doing things differently, running the risk of upsetting users, breaking backward compatibility, or violating common wisdom. Sometimes this way of thinking works out well for them and other times it is pure hubris and doesn’t pan out. But, I do believe they don’t make decisions lightly. As such, as much as many security people think Apple is simply blasé about security, I actually think they have had a strategy for some time now and it is definitely different than the rest.

One of the main roadblocks I’ve seen in the software security space is that very often the advice that is given out is “do better”.  While contemporary protections like ASLR, DEP, etc. have upped the ante for successful exploitation, it isn’t as if they are a cure-all.  So, in the end, we still advise developers to do better.  All of the above mentioned protections take the perspective of trying to prevent memory corruption from letting an attacker execute code in the context of the user.  But, what if an attacker does get code executing in the context of the user?  Windows, Linux, and even Apple have thought about this issue as well, and each has their own approach.  Linux has SELinux, AppArmor, and other such Linux security modules that let a developer get extremely fine grained with regard to the permission set available to a given process.  Windows has a similar idea with their “low integrity” process model as well as miscellaneous things you can do with “access tokens”, “securable objects”, etc.  And then we have Apple, who developed a model very similar to AppArmor called Seatbelt (though it hasn’t historically been widely used or exposed to developers).  So, what’s the issue?  These all seem like pretty good ideas, so why isn’t everyone using them?  Well, it’s all about the marketing.

So, what do I mean by marketing?  I am most definitely not talking about a Mac vs. PC style campaign where Mac touts the benefits of Seatbelt over Windows’ approach to using process integrity.  No, I am speaking more of the marketing around selling their solution to developers.  Historically the only people incentivized to use these extra protections were developers that really wanted to do better.  As an example, Chrome obviously has an incentive to do better, as one of their core tenets is security.  They want to be known as the secure browser, and as a result, they have gone through non-trivial effort to create a security architecture for their browser that is second to none.  But, what is the incentive for the average developer to go through this much effort/bother?  Similar arguments can be made on the Linux side of the house, as there is really very little incentive for the average developer to care about and subsequently bother with these mechanisms, as security is very likely not going to be the thing that makes or breaks a project’s success.  However, from Apple’s perspective, if developers simply fail to do better over time, then it is Apple’s brand that is watered down.  So, what is Apple doing?  They are doing it their own way.

Sure, Apple will probably improve their ASLR as well as address a number of other current deficiencies so that they eventually measure up to the likes of Microsoft and Linux (hopefully this is all coming in Lion).  But, that is all table stakes at this point.  What is Apple going to do that is inherently different than either Microsoft or Linux?  For that, we only need to look at the App Store model.  Apple found a goldmine in marketshare/mindshare with the iOS App Store.  When Apple first released their iOS App Store much of the tech community rejected the idea of such a closed platform.  However, as time has shown, there is a huge percentage of the population that is 100% OK with such a platform, and there are benefits to such a platform from a security perspective.  All iOS applications are jailed à la Seatbelt, preventing one application from touching any other application’s data.  Also, each application passes through Apple as the gatekeeper.  So, if/when an application is found to be doing something suspicious, Apple has the capability to pull the app from users’ phones.  Sure, there are parts of this that sound way too Orwellian, but there is absolutely value in this model for a large subset of users.

If we look at the growth of the iOS App Store you will notice one very important thing.  Apple never told developers to secure their applications.  No, instead they presented developers with a proposition.  They basically offered developers the following: if you are willing to trade off a bit of flexibility in your application, we will manage much of the marketing and distribution for your application.  Essentially, if you were willing to make a few concessions, you might walk away with a decent paycheck.  So, what were those concessions?  Well, you can’t write an application that requires root, you can’t read/write to arbitrary locations on the filesystem, you can’t use undocumented and/or private APIs, and you must let Apple review your application to ensure you haven’t violated any of these concessions, as well as a few other rules.  The net result is that Apple marketed their way to a more secure platform (it was a win for developers wanting to make money and a win for users who wanted a great experience for buying applications).

So, what does this have to do with the Mac? Well, Apple released the Mac App Store not long ago. Just as with the iOS App Store, there are some concessions developers must make if they want to make it in the store. These concessions are very similar to those in the iOS App Store. The next version of OS X touts “application sandboxing” as a new feature (probably based off of Seatbelt). I nearly guarantee that in order for an application to make it into the Mac App Store it is going to have to be built in such a way it could be sandboxed (i.e. no root, no low-level filesystem access, etc). So, again, Apple is creating an incentive for doing better without having to idly sit by the sidelines hoping that developers do better all by themselves.  While not a mind-blowing concept, it is different.  This is what Apple is all about; they pull together some reasonably good piece of technology, though it is likely not revolutionary, and then play to their core strength…selling the idea to their customer.  In this case they created a product that is enticing to both developers and users, with the result being a great user experience, and one that just happens to have the nice side effect of helping developers do better.

Updated iPhone Keychain Dumper

By Patrick Toomey

So, apparently when I released the initial version of Keychain Dumper I failed to account for the fact that the keychain stores protected data in a few different tables within the keychain-2.db SQLite database.  Someone left a comment on the initial release letting me know they were not seeing mail accounts, etc. being dumped.  A quick look at my code and the Apple development docs and I noticed that, sure enough, I was only decrypting items with the “kSecClassGenericPassword” security class.  I quickly updated the code to also decrypt the “kSecClassInternetPassword” security class.  There are additional security classes, but they don’t appear all that interesting to the average user (let me know if this isn’t correct).  So, I’ve updated the code on GitHub here.  I performed a 30 second check, and it appears to now dump all of the same items as before, as well as items from the Internet passwords table.  Let me know if anyone has any issues with the update.

On a final note, the README.md on GitHub mentions creating a symbolic link to build the project. The link in the readme refers to the iOS 4.2 SDK.  However, when I updated the tool I noticed that my SDK was now set to 4.3, and I had to update the symbolic link accordingly.  So, either just download the binary release on GitHub, or make sure you take note of your SDK version.

“Secure by Default” doesn’t seem to be ColdFusion’s motto

By Patrick Toomey

It is a trivial truth, but that doesn’t make it any less true: secure development is not easy.  Given the dynamics at play in the majority of companies, many developers are incentivized to produce code as quickly as possible.  There are often promises made to customers, impending marketing efforts with immovable (e.g. holiday-related) deadlines, etc.  So, as much as I enjoy security from a purist’s standpoint, I also recognize security is obviously placed in the context of countless other, potentially equally valid, constraints.  Given that, it is always nice to see anything that can be done to help developers write more secure code.  For example, take a feature that has an infamous history: file uploads.  Years ago it felt like nearly every file upload I came across was riddled with horrible flaws.  Fast forward to the average Ruby on Rails assessment I run across.  The community has built a large library of tested components that can be modularly incorporated into new projects, including components for handling file uploads.  I’m not saying I have never run across a file upload vuln in a Rails app, but the proportion has dropped precipitously from the days when everyone was just inventing their own.  I think this is a direct result of Rails libraries that really are repurposable and don’t require devs to rip them apart to reuse for their application.

This got me thinking about what should be the king of the hill when it comes to this type of object reuse: Rapid Application Development (RAD)/4GL languages/frameworks.  These platforms often include a good number of built-in mechanisms for handling common functionality.  The RAD platform I have been taking a look at most recently is ColdFusion.  Similar to many of the 4GL vendors, ColdFusion seems to target developers that want a “batteries included” kind of development environment.  In other words, many of these frameworks tend to allow for extremely rapid development, so long as the project doesn’t deviate too far from their target market.  In theory I love the idea, as many straightforward CRUD-type applications can be rapidly developed without reinventing the wheel (or having to cobble together a bunch of disparate libraries).  Again, in theory, another benefit of these kinds of development environments is security.  These frameworks generally provide fairly abstracted interfaces to features that are security relevant.  Examples include session management, database interaction, encryption, etc.  However, it seems as though when I come across these RAD/4GL development frameworks I am usually more let down than encouraged by what they offer.  That was the case with a recent assessment involving ColdFusion.

While not a complete list by any means, and these are not likely to surprise ColdFusion experts,  here are a few things that popped up pretty quickly on a recent assessment:

Session cookies use CFTOKEN by default – An upper bound for the entropy provided by this token is around 26 bits (which is extremely small relative to other contemporary session tokens).  However, to make things worse, after running the suite of entropy tests baked into Burp, the estimate comes out closer to 5 bits of effective entropy.  I haven’t done a ton of analysis, but there are all sorts of bit-level correlations going on.
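For contrast, here is roughly what I would expect a session token generator to look like.  This is a generic Java sketch of my own (not ColdFusion internals): 128 bits from a cryptographically secure RNG rather than a counter-style token with, at best, ~26 bits.

import java.security.SecureRandom;

public class SessionTokenSketch {
    private static final SecureRandom RNG = new SecureRandom();

    // 128 bits from a CSPRNG, hex encoded; contrast with a token whose
    // theoretical upper bound is ~26 bits of entropy.
    public static String newToken() {
        byte[] bytes = new byte[16];
        RNG.nextBytes(bytes);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}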

Encryption uses an “insecure by default” primitive – By default, the “Encrypt” function built into ColdFusion takes whatever password you provide (of unlimited length), hashes it down to a 32 bit value, and uses that as the seed for a stream cipher.  Given the same password, the stream cipher will generate the same key stream, using no IV at all.  So, given access to any encrypted data, you can retrieve the plaintext and/or create your own ciphertext (it is the exact same kind of attack I described here).
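To make the “secure by default” alternative concrete, here is a generic Java sketch of the general pattern: a real random key and a fresh random IV per message.  This is my own illustration, not ColdFusion’s Encrypt internals.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.SecureRandom;

public class EncryptSketch {
    // Generate a real random key rather than hashing a password down to 32 bits.
    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    public static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        // A fresh random IV per message means identical plaintexts produce
        // different ciphertexts, unlike a key stream reused for every message.
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        // Prepend the IV (it is not secret) so the decrypting side can use it.
        // A real design would also add a MAC over the ciphertext.
        byte[] out = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
        return out;
    }
}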

SQL Injection prevention – ColdFusion has a mechanism for parameterized queries built in to their database abstraction (cfqueryparam for those familiar with ColdFusion).  However, they also, by default, have a mechanism that escapes certain characters in the context of a database query.  I have seen devs find this out and abandon the parameterized queries because ColdFusion “handles this for us”.  However, this built-in protection has weird edge cases that make its use somewhat dangerous.  The most recent example I encountered was the following (i is a counter):

SELECT A WHERE B = '#Form.C#' AND D = '#Evaluate("Form.element_" & i)#'

By default, the built-in protections escape single quotes found in user-posted values.  In other words, their automatic escaping would have been effective if the query were:

SELECT A WHERE B = '#Form.C#' AND D = '#Form.element_" & i#'

But, it turns out the escaping occurs before the Evaluate is performed.  So, it essentially performs the escaping on the string “Form.element_1” (assuming i is 1).  Since there are no illegal characters in that string, the Evaluate then retrieves the raw value from the form field.  The net result was complete access to the database.  This behavior is completely non-obvious.  I would have expected everything between the ‘#’ delimiters, even after evaluation, to be escaped.  There is nearly zero chance the average ColdFusion developer is going to understand when/why some things are escaped and other things are not.
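ColdFusion’s cfqueryparam is the right answer here.  For those more familiar with Java, the equivalent idea is a PreparedStatement, where user input is bound as data rather than ever being concatenated into the query text (the table and column names below are placeholders of my own):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class QuerySketch {
    // User-supplied values are bound as parameters; the driver never treats them
    // as SQL, so there is no escaping step to get wrong (and nothing for an
    // Evaluate-style indirection to slip past).
    public static ResultSet lookup(Connection conn, String c, String element) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
            "SELECT A FROM some_table WHERE B = ? AND D = ?");
        ps.setString(1, c);
        ps.setString(2, element);
        return ps.executeQuery();
    }
}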

This isn’t to say that we should expect frameworks to guarantee error-free code; we can always manage to mess things up :-)  The hope is that these RAD environments would make getting things wrong more difficult, by making the default built-ins relatively secure and forcing the developer to explicitly choose a less secure approach.  As it stands, ColdFusion allows developers to optionally use a random UUID for a session token.  ColdFusion optionally allows developers to specify a more contemporary encryption algorithm.  And again, ColdFusion optionally lets developers use parameterized queries.  It is just a shame all these things are options and not the defaults.  I know there are probably a dozen reasons why these things are implemented as they are, and I am sure much of it has to do with legacy (doesn’t everything involve legacy :-)).  That said, these platforms are often used by developers that don’t have the security savvy to make wise use of things that are optional, and who depend on the platform to have thought about these things for them.

“Researchers steal iPhone passwords in 6 minutes”…true…but not the whole story

By Patrick Toomey

Direct link to keychaindumper (for those that want to skip the article and get straight to the code)

So, a few weeks ago a wave of articles hit the usual sites about research that came out of the Fraunhofer Institute (yes, the MP3 folks) regarding some issues found in Apple’s Keychain service.  The vast majority of the articles, while factually accurate, didn’t quite present the full details of what the researchers found.  What the researchers actually found was more nuanced than what was reported.  But, before we get to what they actually found, let’s bring everyone up to speed on Apple’s keychain service.

Apple’s keychain service is a library/API provided by Apple that developers can use to store sensitive information on an iOS device “securely” (a similar service is provided in Mac OS X).  The idea is that instead of storing sensitive information in plaintext configuration files, developers can leverage the keychain service to have the operating system store sensitive information securely on their behalf.  We’ll get into what is meant by “securely” in a minute, but at a high level the keychain encrypts (using a unique per-device key that cannot be exported off of the device) data stored in the keychain database and attempts to restrict which applications can access the stored data.  Each application on an iOS device has a unique “application-identifier” that is cryptographically signed into the application before being submitted to the Apple App store.  The keychain service restricts which data an application can access based on this identifier.  By default, applications can only access data associated with their own application-identifier.  Apple realized this was a bit restrictive, so they also created another mechanism that can be used to share data between applications by using “keychain-access-groups”.  As an example, a developer could release two distinct applications (each with their own application-identifier) and assign each of them a shared access group.  When writing/reading data to the keychain a developer can specify which access group to use.  By default, when no access group is specified, the application will use the unique application-identifier as the access group (thus limiting access to the application itself).  Ok, so that should be all we need to know about the Keychain.  If you want to dig a little deeper Apple has a good doc here.

Ok, so we know the keychain is basically a protected storage facility that the iOS kernel delegates read/write privileges to based on the cryptographic signature of each application.  These cryptographically signed permissions are known as “entitlements” in iOS parlance.  Essentially, an application must have the correct entitlement to access a given item in the keychain.  So, the most obvious way to go about attacking the keychain is to figure out a way to sign fake entitlements into an application (ok, patching the kernel would be another way to go, but that is a topic for another day).  As an example, if we can sign our application with the “apple” access group then we would be able to access any keychain item stored using this access group.  Hmmm…well, it just so happens that we can do exactly that with the “ldid” tool that is available in the Cydia repositories once you jailbreak your iOS device.  When a user jailbreaks their phone, the portion of the kernel responsible for validating cryptographic signatures is patched so that any signature will validate.  So, ldid basically allows you to sign an application using a bogus signature.  But, because it is technically signed, a jailbroken device will honor the signature as if it were from Apple itself.

Based on the above description, so long as we can determine all of the access groups that were used to store items in a user’s keychain, we should be able to enumerate all of them, sign our own application to be a member of all of them using ldid, and then be allowed access to every single keychain item in a user’s keychain.  So, how do we go about getting a list of all the access group entitlements we will need?  Well, the keychain is nothing more than a SQLite database stored in:

/var/Keychains/keychain-2.db

And, it turns out, the access group is stored with each item that is stored in the keychain database.  We can get a complete list of these groups with the following query:

SELECT DISTINCT agrp FROM genp

Once we have a list of all the access groups we just need to create an XML file that contains all of these groups and then sign our own application with ldid.  So, I created a tool that does exactly that called keychain_dumper.  You can first get a properly formatted XML document with all the entitlements you will need by doing the following:

./keychain_dumper -e > /var/tmp/entitlements.xml

You can then sign all of these entitlements into keychain_dumper itself (please note the lack of a space between the flag and the path argument):

ldid -S/var/tmp/entitlements.xml keychain_dumper

After that, you can dump all of the entries within the keychain:

./keychain_dumper

If all of the above worked you will see numerous entries that look similar to the following:

Service: Dropbox
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: SenSiTive_PassWorD_Here

Ok, so what does any of this have to do with what was being reported on a few weeks ago?  We basically just showed that you can in fact dump all of the keychain items using a jailbroken iOS device.  Here is where the discussion is more nuanced than what was reported.  The steps we took above will only dump the entire keychain on devices that have no PIN set or are currently unlocked.  If you set a PIN on your device, lock the device, and rerun the above steps, you will find that some keychain data items are returned, while others are not.  You will find a number of entries now look like this:

Service: Dropbox
Account: remote
Entitlement Group: R96HGCUQ8V.*
Label: Generic
Field: data
Keychain Data: <Not Accessible>

This fundamental point was either glossed over or simply ignored in every single article I happened to come across (I’m sure at least one person will find the article that does mention this point :-)).  This is an important point, as it completely reframes the discussion.  The way it was reported, it looks like the point is to show how insecure iOS is.  In reality the point should have been to show how security is all about trading off various factors (security, convenience, etc).  This point was not lost on Apple, and the keychain allows developers to choose the appropriate level of security for their application.  Stealing a small section from the keychain document from Apple, they allow six levels of access for a given keychain item:

CFTypeRef kSecAttrAccessibleWhenUnlocked;
CFTypeRef kSecAttrAccessibleAfterFirstUnlock;
CFTypeRef kSecAttrAccessibleAlways;
CFTypeRef kSecAttrAccessibleWhenUnlockedThisDeviceOnly;
CFTypeRef kSecAttrAccessibleAfterFirstUnlockThisDeviceOnly;
CFTypeRef kSecAttrAccessibleAlwaysThisDeviceOnly;

The names are pretty self descriptive, but the main thing to focus in on is the “WhenUnlocked” accessibility constants.  If a developer chooses the “WhenUnlocked” constant then the keychain item is encrypted using a cryptographic key that is created using the user’s PIN as well as the per-device key mentioned above.  In other words, if a device is locked, the cryptographic key material does not exist on the phone to decrypt the related keychain item.  Thus, when the device is locked, keychain_dumper, despite having the correct entitlements, does not have the ability to access keychain items stored using the “WhenUnlocked” constant.  We won’t talk about the “ThisDeviceOnly” constant, but it is basically the most strict security constant available for a keychain item, as it prevents the items from being backed up through iTunes (see the Apple docs for more detail).

If a developer does not specify an accessibility constant, a keychain item will use “kSecAttrAccessibleWhenUnlocked”, which makes the item available only when the device is unlocked.  In other words, applications that store items in the keychain using the default security settings would not have been leaked using the approach used by Fraunhofer and/or keychain_dumper (I assume we are both just using the Keychain API as it is documented).  That said, quite a few items appear to be set with “kSecAttrAccessibleAlways”.  Such items include wireless access point passwords, MS Exchange passwords, etc.  So, what was Apple thinking; why does Apple let developers choose among all these options?  Well, let’s use some pretty typical use cases to think about it.  A user boots their phone and they expect their device to connect to their wireless access point without intervention.  I guess that requires that iOS be able to retrieve their access point’s password regardless of whether the device is locked or not.  How about MS Exchange?  Let’s say I lost my iPhone on the subway this morning.  Once I get to work I let the system administrator know and they proceed to initiate a remote wipe of my Exchange data.  Oh, right, my device would have to be able to login to the Exchange server, even when locked, for that to work.  So, Apple is left in the position of having to balance the privacy of user’s data with a number of use cases where less privacy is potentially worthwhile.  We can probably go through each keychain item and debate whether Apple chose the right accessibility constant for each service, but I think the main point still stands.

Wow…that turned out to be way longer than I thought it would be.  Anyway, if you want to grab the code for keychain_dumper to reproduce the above steps yourself you can grab the code on github.  I’ve included the source as well as a binary just in case you don’t want/have the developer tools on your machine.  Hopefully this tool will be useful for security professionals that are trying to evaluate whether an application has chosen the appropriate accessibility parameters during blackbox assessments. Oh, and if you want to read the original paper by Fraunhofer you can find that here.

Java Cisco Group Password Decrypter

By Patrick Toomey

For whatever reason I have found myself needing to “decrypt” Cisco VPN client group passwords throughout the years.  I say “decrypt”, as the value is technically encrypted using 3DES, but the mechanism used to decrypt the value is largely obfuscative (the cryptographic key is included in the ciphertext).  As such, the encryption used is largely incidental, and any simplistic substitution cipher would have protected the group password equally well (think newspaper cryptogram).

I can’t pinpoint all the root reasons, but it seems as though every couple of months I need a group password decrypted.  Sometimes I am simply moving a Cisco VPN profile from Windows to Linux.  Other times I’ve been on a client application assessment/penetration test where I’ve needed to gain access to the plaintext group password.  Regardless, I inevitably find myself googling for “cisco group password decrypter”.  A few different results are returned.  The top result is always a link here.  This site has a simple web app that will decrypt the group password for you, though it occurs server-side.  Being paranoid by trade, I am always apprehensive about sending information to a third party if that information is considered sensitive (whether by me, our IT department, or a client).  I have no reason to think the referenced site is malicious, but it would not be in my best interest professionally not to be paranoid, er, security conscious, particularly with client information.  The referenced site has a link to the original source code, and one can feel free to download, audit, compile, and use the provided tool to perform all of the decryption client-side.  That said, the linked file depends on libgcrypt.  That is fine if I am sitting in Linux, but not as great if I am on some new Windows box (ok, maybe I’m one of the few who doesn’t keep libgcrypt at the ready on my fresh Windows installs).  I’ve googled around to see if anything exists that is more portable.  I found a few things, including a link to a Java applet, but the developer seems to have lost the source code…  So, laziness won, and I decided it would be easier to spend 30 minutes writing my own cross-platform Java version than spend any more time on Google.

An executable jar can be found  here and the source can be found here.  I am not a huge fan of Java GUI development, and thus leveraged the incredible GUI toolkit built in to NetBeans.  The referenced source code should compile cleanly in NetBeans if you want the GUI.  If you simply want to decrypt group passwords with no dependencies you can run a command line version by compiling the “GroupPasswordDecrypter” class file.  This file has zero dependencies on third-party libraries and should be sufficient for anyone that doesn’t feel compelled to use a GUI (me included).

As a quick example, I borrowed a sample encrypted group password from another server-side implementation.  The encrypted group password we would like to decrypt is

9196FE0075E359E6A2486905A1EFAE9A11D652B2C588EF3FBA15574237302B74C194EC7D0DD16645CB534D94CE85FEC4

Or, if you prefer the command line version:
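Something along the lines of the following should work; the exact invocation is an assumption on my part (I’m assuming the GroupPasswordDecrypter class takes the encrypted hex string as its only argument), so check the source if it differs:

javac GroupPasswordDecrypter.java
java GroupPasswordDecrypter 9196FE0075E359E6A2486905A1EFAE9A11D652B2C588EF3FBA15574237302B74C194EC7D0DD16645CB534D94CE85FEC4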

Hope it comes in handy!


How NOT to build your client-server security architecture

By: Patrick Toomey

Traditional client-server and web applications aren’t that different from a security standpoint.  Sure, native UI controls are great for a nice look and feel, and access to native OS APIs is great for creating a high performance application that integrates well with the target platform.  But, from a security standpoint, they are on equal footing.  Neither should trust the client, requiring that developers place all of the security relevant logic on the server.  By and large most web application developers seem to understand that the browser is not to be trusted.  That isn’t to say that all web application developers get it.  There are definitely those web apps that delegate a bit too much to the browser, using hidden form fields inappropriately, depending on client-side JavaScript a bit too heavily, amongst other things.  But, on the whole, I have yet to see web applications designed with as many fundamental architectural flaws as seems to be common amongst niche vertical market client-server applications.  Wow, did I just say “niche vertical market client-server applications”?  What does that mean?  I am talking about the types of client-server applications that were either developed in house, or were bought commercially, but serve a very specific task.  Before I get into trouble for generalizing, yes, I concede that my views are shaped by my experiences.  I am absolutely sure that there are scores of “niche vertical market client-server applications” that are rock solid from a security standpoint.  But, my experiences with these types of applications, combined with recent war stories from some colleagues, compelled me to rant…just a little.

The vast majority of the applications we assess are web based.  Sure, we look at our fair share of client-server applications, but I would say the ratio is probably on the order of 10 to 1.  For whatever reason, we had a string of more traditional client-server applications on our plate in the last few weeks.  Looking at the types of vulnerabilities identified in these applications, as well as other similar applications we have reviewed in the past, it seems as though there is a fundamental gap in how a fair percentage of these developers are approaching security.  To illustrate, let’s listen in on a couple of web developers discussing a few pieces of their security architecture for a new web app they are about to deploy.

Web Dev 1:  So, we need to check to make sure only authenticated and authorized users are allowed to access our app.

Web Dev 2:  Yeah, it seems like this whole username password thing is pretty popular.  I vote we go with that.

Web Dev 1:  Ok, yeah, no need to reinvent the wheel here.  So, how should we go about validating the user’s username and password?

Web Dev 2:  Well, we could send an AJAX request to the server, passing the user’s username, and get back all of the user’s information, including their password, and simply validate it in JavaScript.  That way we can keep the logic really clean and simple on the server side of things.

Web Dev 1:  Yeah, that is awesome; with a little more effort we almost don’t need an application server.  We can do everything client-side.

Web Dev 2:  Yeah, we can do all the logic on the client in JavaScript and simply make calls to the backend to store and retrieve data.

Web Dev 1:  Wow, I think we have hit on the next big thing.  All we really need is to make some sort of direct ODBC connection to our database via JavaScript and we’ve nearly collapsed an entire tier out of the three tier architecture…incredible.

Web Dev 2:  Hmmm, yeah, it sounds pretty great, but we are putting a lot of our eggs into one basket.  What if someone figures out how to view or manipulate traffic as it traverses the network?  They might be able to do bad stuff.

Web Dev 1:  Yeah, true, there must be some way to prevent our users from mucking around with the network traffic.

Web Dev 2:  I got it!!  We can use SSL.  That uses top notch cryptography that nobody can break.

Web Dev 1:  I think we have it.  But, I remember reading about some sort of tool that lets users man-in-the-middle their own SSL traffic, something to do with proxies and self-signed certificates or something?

Web Dev 2:  Hmmmm….that could be a problem, but it sounds like it would be really hard for users to do.

Web Dev 1:  Yeah, nobody will figure it out.  I think we are good to go.

Ok, the above conversation is obviously hyperbolic for effect, but do you see what I am saying here?  The above conversation just wouldn’t happen in a web app world.  Almost every sentence is filled with principles that are antithetical to the web security model: validating passwords client-side in the browser, connecting directly to the database from the browser (thankfully this doesn’t exist…I googled it just to make sure there wasn’t some inept RFC I wasn’t aware of), or treating SSL like it is some sort of magic crypto juju you can sprinkle over your application to protect yourself against users that have full control of the execution environment.

I have seen some web applications in a pretty sad state of disrepair, but I have never seen a web application that was architected with all of the above fundamental flaws.  Sadly, on the client-server side of things I have seen each of the above approaches used with regularity.  Applications that validate passwords by requesting them from the server, applications that treat the server simply as an ODBC connection, and applications that depend on SSL as their sole security control for preventing malicious user input and/or the disclosure of highly sensitive information (e.g. those passwords being sent back to the client from the server).  So, in short, if you are writing a client-server application, think of your client as the browser, and treat it with the same degree of trepidation.  The client cannot be trusted to enforce your security policy, and any expectation that it will do so is likely to result in an architectural flaw that may very well subvert the entirety of your security model.
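
To make the “treat your client like a browser” point a bit more concrete, here is a minimal sketch of keeping that trust boundary on the server.  It is purely illustrative (the endpoint, user store, and hashing parameters are all hypothetical), but notice that the stored secret never leaves the server; the client only ever learns whether authentication succeeded.

```python
import hashlib
import hmac
import os

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical server-side user store.  In a real app this lives in a
# database; the point is simply that the salted hash never leaves the server.
_salt = os.urandom(16)
_users = {
    "alice": (_salt, hashlib.pbkdf2_hmac("sha256", b"correct horse", _salt, 100_000)),
}

@app.route("/login", methods=["POST"])
def login():
    creds = request.get_json(force=True)
    record = _users.get(creds.get("username", ""))
    if record is None:
        return jsonify({"authenticated": False}), 401
    salt, stored_hash = record
    candidate = hashlib.pbkdf2_hmac(
        "sha256", creds.get("password", "").encode(), salt, 100_000
    )
    # All of the security-relevant comparison happens here, on the server.
    if hmac.compare_digest(candidate, stored_hash):
        return jsonify({"authenticated": True})
    return jsonify({"authenticated": False}), 401

if __name__ == "__main__":
    app.run()
```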

Even if You Don’t Invent Your Own Crypto….It’s Still Hard

By: Patrick Toomey

So, yet another crypto related post.  Often times crypto related attacks fall on deaf ears, as for most people crypto is like magic (include me in that group every now and again), and it becomes difficult for the average developer to differentiate the risks associated with the attacks circulated within the security community.  There is a vast chasm between the crypto community and just about everyone else.  The cryptographers will talk about differential cryptanalysis, meet in the middle attacks, perfect forward secrecy, etc., etc.  Often, the cryptographers’ goals are different from those of your average developer.  Cryptographers are trying to achieve perfection.  For them, any flaw in a cryptographic protocol is something to be discussed, researched, and fixed.  As the famous saying in the crypto community goes, “Attacks only get better”.  So, even the smallest flaw is worth their time and effort, as it may lead to a practical crack that will warrant more immediate attention.

Unlike cryptographers, for the vast majority of developers the goal is “good enough” security.  Often times the result is not good enough, but the goal is admirable, as it doesn’t make sense to over-invest in security.  As a result, there is a chasm between “good enough” for cryptographers and “good enough” for developers.  In this chasm are the real world attacks that are too boring for your average cryptographer to discuss, yet too subtle to be obvious to your average developer.  Many of these attacks have nothing to do with the cryptography itself.  Instead, many real world crypto attacks have more to do with how developers piece together proven crypto building blocks insecurely.  It was on a recent assessment that I came upon the following story, one that perfectly illustrates how cryptography is difficult to get right, even when you think you have made an informed decision.

On a recent assessment a colleague of mine came across a typical SQL injection attack.  In fact, it was just the kind of SQL injection you hope for as an attacker, as it returned data to the user in a nice tabular output format.  After a bit of mucking around we managed to find where the system was storing credit card information for its users.  A few minutes later we had output that looked like the following:

[Image: Base64 Credit Card Numbers]

Hmmm, base64 encoded credit card numbers.  I wonder what they look like when we base64 decode them.  A simple base64 decode resulted in the following hex values:

[Image: Base64 Decoded Credit Card Numbers]

If you start trying to translate the above hex values to their ASCII equivalent you will quickly find that the values do not represent plaintext ASCII credit card numbers.  Our first guess was that they were likely encrypted.  So, step two is trying to figure out how they were encrypted.  We took a look at the base64 decoded values and observed a simple fact: the vast majority of them were either 15 or 16 bytes long.  Well, a 16 byte ciphertext could be the output of any 128 bit block cipher in simple ECB mode.  But, the 15 byte entries were evidence against that, as they would get padded out to 16 bytes due to the requirement that block ciphers encrypt full blocks (16 bytes in the case of AES).  So, my first thought was the use of a stream cipher.  In their most basic form, a stream cipher simply takes a secret value as input (the key), and begins to generate a pseudorandom output stream of bytes.  Encryption simply involves XORing the stream of random bytes with your plaintext.  Unlike a block cipher, there is no need to pad out your plaintext, as you only use as many pseudorandom bytes as are necessary to XOR with your plaintext.  Decryption is also simple, as XORing is a trivially invertible operation.  The user simply feeds the algorithm the same key, generates the same pseudorandom byte stream, and XORs that with the ciphertext.
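
If it is easier to see in code, here is a minimal sketch of the idea in Python.  The keystream generator below is something I made up for illustration (repeatedly hashing the key and a counter); it is not the application’s actual algorithm.  The point is simply that the ciphertext is a byte-for-byte XOR of the plaintext with a keystream, so a 16 character card number encrypts to exactly 16 bytes, no padding required.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Illustrative keystream generator (NOT the app's real algorithm):
    hash the key plus a counter until we have enough bytes."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Stream-cipher style: ciphertext is plaintext XOR keystream, so the
    # ciphertext is exactly as long as the plaintext.
    return xor_bytes(plaintext, keystream(key, len(plaintext)))

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # Decryption is the same operation with the same keystream.
    return xor_bytes(ciphertext, keystream(key, len(ciphertext)))

ct = encrypt(b"secret key", b"4012888888881881")
print(len(ct))                      # 16 -- no padding, unlike a block cipher
print(decrypt(b"secret key", ct))   # b'4012888888881881'
```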

To see why this works let’s take a look at how this works for a single plaintext/ciphertext.  Let us say we wish to encrypt the credit card number “4012888888881881″. The ASCII values in hex result in a plaintext array of the following values:

[Image: Plaintext Credit Card Number ASCII in Hex]

Also, let us assume that we are using a stream cipher with a strong key whose first 16 output bytes are:

[Image: Pseudorandom Byte Stream in Hex]

When we XOR the plaintext with the pseudorandom stream we obtain the ciphertext.

[Image: Resulting Ciphertext]

To decrypt we simply take the ciphertext and perform the same XOR operation.  Let’s quickly take our ciphertext and see how we decrypt it.

[Image: Decrypted Ciphertext]

When we XOR the ciphertext with the stream we obtain the plaintext we started with.  Through the whole encryption/decryption process we have effectively XORed the plaintext with the stream twice.  XORing anything with the same value twice is effectively a NOP on the original value.  Pictorially:

[Image: Full Encryption Decryption Cycle]
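
The same round trip in a couple of lines of Python, using a single byte for brevity (the keystream byte value here is made up):

```python
p  = 0x34           # ASCII '4', the first byte of our plaintext card number
ks = 0x9f           # one made-up keystream byte
c  = p ^ ks         # encrypt: XOR the plaintext byte with the keystream byte
assert c ^ ks == p  # decrypt: XOR with the same byte again restores the plaintext
```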

However, our examples so far have assumed knowledge of the pseudorandom stream.  Without knowledge of the key it is difficult to reproduce the pseudorandom stream, and thus very difficult to perform the steps we just outlined to obtain the plaintext for a given ciphertext.  Moreover, we don’t even know what algorithm is in use here.  Did the developers use RC4?  Did they try to make their own pseudorandom generator by repetitively hashing a secret key?  We didn’t really know, but regardless of the algorithm, there is inherent risk with leveraging a stream cipher in its most basic form.  If the stream cipher uses a shared key, the pseudorandom stream will be identical every time it is initialized.  To test this conjecture we used several test user accounts to register identical credit cards.  Sure enough, identical credit card numbers resulted in identical ciphertext.  So, what can we do with this?  We have no idea what algorithm was used, and we have no idea what the value of the secret key is.  However, it turns out that we don’t need any of that information.  To see why, let’s first take a look at the hex of two encrypted credit cards we base64 decoded above; the sketch just below shows the keystream reuse in miniature.
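
Here is that test in miniature, with a made-up keystream value standing in for whatever the application derives from its shared key:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Because the key never changes, neither does the keystream.  The value below
# is made up purely for illustration; the real one is unknown to an attacker.
stream = bytes.fromhex("a3f1c8027e9d5b64ff0312c97a88d4e1")

card = b"4012888888881881"
ct_from_account_a = xor_bytes(card, stream)
ct_from_account_b = xor_bytes(card, stream)
assert ct_from_account_a == ct_from_account_b  # same card, same ciphertext
```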

[Image: First Encrypted Credit Card]

The above shows the 16 ciphertext bytes for one of the credit card numbers in the database.

[Image: Second Encrypted Credit Card]

The above shows the 16 ciphertext bytes for a second credit card number in the database.

Let’s assume that the ciphertext from the first encrypted credit card is ours, meaning we know the plaintext ASCII hex values for the card.

[Image: Plaintext Bytes for the First Credit Card]

Let’s also assume that the second credit card represents a credit card of unknown origin (i.e. all we can observe is the ciphertext; we have no idea what the digits of the card are).  Similarly, without knowledge of the key, we have no idea how to generate the pseudorandom byte stream used to encrypt the cards.  But, what happens if we simply XOR these two ciphertexts together?

[Image: XOR of the Two Encrypted Credit Cards]

Well, that doesn’t really look like it got us anywhere, as we still don’t have anything that looks like an ASCII credit card number.  However, let’s take a look at what actually happened when we did the above operation.  We are confident that both credit card numbers were encrypted with the same pseudorandom byte stream (remember that encryption of the same credit card under different users resulted in the same ciphertext).  Furthermore, we know that the first credit card number is “4012888888881881”, as that is our own credit card number.  The key insight is that we have effectively XORed four different byte sequences together.  We have XORed the plaintext bytes of our credit card, the plaintext bytes of an unknown credit card, and the same pseudorandom sequence twice.

[Image: Encrypted Credit Card XOR Expanded]

In the above picture the unknown plaintext credit card number is represented with a series of “YY”s.  Similarly, the unknown pseudorandom stream is represented with a series of “XX”s.  However, despite the fact that we don’t know the value of the pseudorandom stream, we do know that both copies of it are identical.  And, as before, XORing anything with the same value twice is effectively a NOP.  So, the above three XOR operations can be simplified to the following:

[Image: Simplified XOR of Encrypted Credit Cards]

So, what we actually did by XORing the two ciphertexts is create a new value that is the XOR of the two plaintexts.  Wow, that is kind of cool.  We just got rid of the pseudorandom byte stream that is supposed to provide all the security without ever knowing the key.  But, what do we do now?  We still don’t have plaintext credit card numbers yet.  Ah, but we do!  You can think of what we now have as the unknown credit card number encrypted with our known credit card number as the pseudorandom stream.  In other words, we can simply XOR our result with our known credit card value again and we will obtain the resulting plaintext for the unknown credit card number.

[Image: Decrypted Credit Card]

Once you translate the resulting hex into the equivalent ASCII string you obtain “5362016689976147”, a valid MasterCard number (obviously we have used test credit card numbers throughout this blog entry).  We can simply apply the above steps to recover the full unencrypted credit card numbers for the entire database.  We have effectively used our own encrypted credit card number as a key to decrypt all of the other unknown credit cards in the database.  This is a pretty impressive feat given that we never knew the algorithm or the secret key used to encrypt all of the credit card information.
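
The whole attack fits in a few lines of Python.  The keystream value below is made up (we never actually learned it), but it shows how a reused keystream lets one known plaintext/ciphertext pair decrypt everything else:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Setup (hidden from the attacker): a fixed keystream reused for every record.
keystream = bytes.fromhex("a3f1c8027e9d5b64ff0312c97a88d4e1")  # made-up value
ct_ours   = xor_bytes(b"4012888888881881", keystream)  # our card: plaintext known
ct_victim = xor_bytes(b"5362016689976147", keystream)  # victim's card: unknown

# The attack only needs the two ciphertexts and our own card number.
# XORing the ciphertexts cancels the shared keystream, leaving pt1 XOR pt2...
pt_xor = xor_bytes(ct_ours, ct_victim)

# ...and XORing our known plaintext back in recovers the victim's card.
recovered = xor_bytes(pt_xor, b"4012888888881881")
print(recovered)  # b'5362016689976147'
```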

As it turns out, we eventually received source code for the functionality under investigation and discovered that the developers were using RC4.  RC4 has had its fair share of issues, and is generally not a recommended algorithm, but none of what we just discussed had to do with any weakness in RC4.  The above attack would have worked on any stream cipher used in the same way.  So, as is typical, the attack had more to do with how the cryptography was used than with the cryptographic algorithm itself.  We can’t say it enough….crypto is hard.  It just is.  There are just too many things that developers need to get right for crypto to be done correctly.  Algorithm selection, IVs, padding, modes of encryption, secret key generation, key expiration, key revocation…the list goes on and on.  We in the security community have been pounding the “don’t invent your own cryptography” mantra for a long time.  However, that has led to a swarm of developers simply leveraging proven cryptography insecurely.  So, we need to take this one step further.  On average, it is best for developers to defer to proven solutions that abstract the notion of cryptography even further.  For example, GPG, Keyczar, et al. have abstractions that simply let you encrypt/decrypt/sign/verify things based off of a secret that you provide.  They handle proper padding, encryption mode selection, integrity protection, etc.  That is not to say that there aren’t use cases for needing to delve into the weeds sometimes, but for the average use case a trusted library is your friend.
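
As one concrete example of that kind of higher-level abstraction (my example here, not something from the assessment), the Fernet recipe in the Python cryptography package picks the algorithm, mode, IV, and integrity protection for you; all the developer supplies is a key and the plaintext:

```python
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()  # generate once, store with your other secrets
f = Fernet(key)

# A fresh random IV is used for every encryption, so encrypting the same card
# twice yields two different tokens -- unlike the reused keystream above.
token_a = f.encrypt(b"4012888888881881")
token_b = f.encrypt(b"4012888888881881")
assert token_a != token_b

print(f.decrypt(token_a))  # b'4012888888881881'
```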