MPTCP Roams Free (By Default!) – OS X Yosemite

Further to the BlackHat USA work by Patrick Thomas (@coffeetocode) and me (@secvalve).

MPTCP is enabled by default in Mac OS X Yosemite. So we can expect to see Multipath TCP on most networks, and on a total of tens to hundreds of millions of devices.


Thanks to Ilias Marinos (@marinosi), who tripped my Twitter search bot.

More to come…. We have stuff as yet unreleased that has suddenly become VERY relevant.

Multipath TCP – BlackHat Briefings Teaser

Multipath TCP: Breaking Today’s Networks with Tomorrow’s Protocols is being presented at Black Hat USA this year by me (Catherine Pearce, @secvalve) and Patrick Thomas (@coffeetocode). Here is a bit of a teaser; it’s a couple of weeks out yet, but we’re really looking forward to it.

Come see us at Black Hat Briefings in South Seas AB, on Wednesday at 3:30pm.

(UPDATE 8/14: A followup post and the talk slides are now online.)

What is multipath TCP?

Multipath TCP is a backwards-compatible modification that allows a core networking protocol, TCP, to talk over multiple paths at the same time. In short, Multipath TCP decouples TCP from a specific IP address, and it also allows you to add and remove network addresses on the fly.

Multipath TCP in brief

Multipath TCP splits connection data across N different TCP subflows



Why do I care?

MPTCP changes things for security in a few key ways:

  • Breaks Traffic Inspection – If you’re inspecting traffic you need to be able to correlate and reassemble it. We haven’t found a single security technology which does so currently.
  • Changes network Trust models – Multipath TCP allows you to spread traffic around, and also remove the inherent trust you place in any single network provider. With MPTCP it becomes much harder for a single network provider to undetectably alter or sniff your traffic unless they collaborate with the other ones you are using for that connection.
  • Creates ambiguity about incoming and outgoing connections – The protocol allows a client to tell a server that it has another address which the server may connect back to. To a firewall that doesn’t understand MPTCP it looks like an outgoing connection.
MPTCP and Reverse connections

MPTCP can have outbound incoming connections!?



Backwards compatible

Did I mention that MPTCP is designed to be backwards compatible and runs on >= 85% of existing network infrastructure? [How Hard Can It Be? Designing and Implementing a Deployable Multipath TCP]

Like IPv6, this is a technology that will slowly appear in network devices and can cause serious security side effects if not understood and properly managed. MPTCP affects far more than addressing though, it also fundamentally changes how TCP traffic flows over networks.

MPTCP confuses your existing approaches and tools

If you don’t understand MPTCP, things get really confusing. Take this Wireshark “Follow TCP Stream” view, where I follow an HTTP connection. Why does the server reply to an invalid request this way?

MPTCP Fragmentation confuses wireshark

Why does the web server reply to this garbled message? – MPTCP Confuses even tools that support it


Network flows can also become a lot more complicated. Why talk over a single network path when you can talk through all possible paths?


That’s what your non MPTCP-aware flows look like.

But, if we are able to understand it then it makes a lot more sense:


What are the implications?

Technologies are changing, and multipath technologies look like a sure thing in a decade or two. But, security isn’t keeping up with the new challenges, let alone the new technologies.

  1. I can use MPTCP to break your IDS, DLP, and many application-layer security devices today.
  2. There are security implications in multipath communications that we cannot patch our existing tools to cope with; we need to change how we do things. Right now tools can correlate flows from different points on the network, but they are incapable of handling data when part of it flows down one path and part of it flows down another.

To illustrate point 2:

What if you saw this across two subflows… Can you work out what they should be?

  • Thquicown fox jps ov the az og
  • E k brumerlyd.

Highlight the text below to see what that reassembles to

[The quick brown fox jumps over the lazy dog.]

Follow up with our Black Hat session as we discuss MPTCP and its effect on security in yet more detail. We may not be ready for the future, but it is fast approaching; just ask Siri.

How does your security decide what to do with a random fragment of a communication?



XSS hunting through forensic standards-analysis.

By Kate Pearce

Brief: Web standards are complex. With request encoding, Microsoft loses if they are “compliant” and they also lose if they are not.

“Ambiguous RFC leads to Cross Site Scripting” was posted by a colleague at Neohapsis Labs (Patrick Toomey) a few weeks ago, and a related post was also put up by Rob Rachwald at Imperva’s blog. As I have read through some of the associated RFCs many times, I decided to dig a little deeper. I journeyed through the final versions of seven RFCs defining three things (URL, URI, and HTTP), in an attempt to track down just how this issue arrived in the standards and how the Internet Explorer behavior fits in.

What I seem to have found is a situation that illustrates the complexity of standards development, shows how unintended consequences can emerge during development, and also, surprisingly, how Microsoft is placed in a lose-lose situation with Internet Explorer and standards compliance. It appears that if Microsoft is fully, and minimally, standards compliant then they need to exhibit behavior that the other browsers do not. Should they add “safe” behavior, they not only break some legacy applications, but also need to add behavior whose status the standard isn’t entirely clear on.

Microsoft loses if they are “compliant” and they lose if they are not. And that presumes you can even work out which standard is applicable in the first place….

Recap of the issue at hand:

Cross Site Scripting occurs when a web application or server takes unvalidated and unsanitized user input and displays it back in such a way that any active (or otherwise harmful) content embedded in it (such as JavaScript) will be executed. This happens because web browsers generally treat anything received from a web server as having originated there. When malicious content is passed through a web server first, the browser loses whatever context that content originally had and instead associates it all with the server. Patrick’s post has a walkthrough of an example of this and how it can be abused.

The specific XSS-related problem of inconsistent percent-encoding of sensitive characters in requests across different web browsers is an interesting one. Percent encoding means that if an application directly repeats unsafe input, it will be sent to the server in a form with a percent sign and the ASCII value in hex, rather than in raw form. So an injected input like "><script>alert(123)</script><"

will become the following in the webpage source code where it says “hello NAME”:


which will not, and cannot, execute as it is neither valid JavaScript nor valid HTML.

Well, it turns out that Firefox, Chrome, and Safari all perform this encoding of request parameters, while Internet Explorer does not. Therefore any website which naively repeats input from URL parameters may find that its IE-wielding users are vulnerable to XSS while those using other browsers are not.
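As a rough illustration (using Python's standard urllib rather than any particular browser's implementation), this is what the injected parameter above looks like once percent-encoded:

```python
from urllib.parse import quote

# The reflected-XSS probe from above, as typed into a URL parameter.
payload = '"><script>alert(123)</script><"'

# An encoding browser sends each unsafe character as "%" plus its
# ASCII value in hex; quote(safe="") encodes everything outside the
# RFC 3986 unreserved set.
print(quote(payload, safe=""))
# -> %22%3E%3Cscript%3Ealert%28123%29%3C%2Fscript%3E%3C%22
```

Nothing in that encoded string can close an attribute or open a script tag, which is why the encoding browsers' users are protected.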

Thus it appears that Internet Explorer increases the risk of its users to Cross-Site Scripting.

Latest standards

Both previous posts on this issue list RFC 3986, “URI Generic Syntax”, as the root of the problem, because it lists reserved characters but neglects to mention the XML/HTML delimiters of < and > (page 12, section 2.2).

    reserved    = gen-delims / sub-delims

    gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                / "*" / "+" / "," / ";" / "="
Interestingly, these are not listed in unreserved characters at the bottom of the page either:
   Characters that are allowed in a URI but do not have a reserved
   purpose are called unreserved.  These include uppercase and lowercase
   letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

So, should they be encoded or not? They are not explicitly unsafe, nor are they explicitly safe!
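To make the gap concrete, here is a small sketch with the character classes transcribed from the RFC 3986 grammar quoted above, showing that < and > fall into neither class:

```python
# RFC 3986 character classes, transcribed from the grammar above.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

for ch in "<>":
    # Neither reserved nor unreserved: the RFC is silent on them.
    print(ch, ch in RESERVED, ch in UNRESERVED)
# -> < False False
# -> > False False
```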

“Family” history

Patrick mentions that RFC 1738, “Uniform Resource Locators” (which RFC 3986 above updated), specifically mentioned < and > as unsafe on page 2:

   The characters "<" and ">" are unsafe because they are used as the
   delimiters around URLs in free text; the quote mark (""") is used to
   delimit URLs in some systems.  The character "#" is unsafe and should
   always be encoded because it is used in World Wide Web and in other
   systems to delimit a URL from a fragment/anchor identifier that might
   follow it.

However, it occurred to me that in between the times of these two standards there are other players: namely RFC 2396, which was obsoleted by RFC 3986, and RFC 1808, which was obsoleted by 2396. Interestingly, RFC 1738 states that it is updated by 1808, but 1808 doesn’t mention that it updates 1738. Note that 1808 is only a partial update to 1738, as it is only concerned with relative URLs.

With this chain we have, in increasing time going down:

RFC 1738
Uniform Resource Locators (URL)


RFC 1808
Relative Uniform Resource Locators


RFC 2396
Uniform Resource Identifiers (URI): Generic Syntax


RFC 3986
Uniform Resource Identifier (URI): Generic Syntax

At the top of this chain we have < and > being encoded, but at the bottom we don’t. What happened in between?

I’ll get to that soon, but first I have to introduce another RFC family, the HTTP family of RFCs.

“Neighborly” history.

Since HTTP is really what we are concerned with (it uses URIs to find resources), we need to look at the specifications for HTTP too.

Interestingly, the first IETF HTTP standard, RFC 1945 “Hypertext Transfer Protocol — HTTP/1.0”, had < and > as unsafe and required encoding (referencing RFC 1808), as did the first HTTP/1.1 standard, RFC 2068. But the latest HTTP RFC, RFC 2616 “Hypertext Transfer Protocol — HTTP/1.1”, does not explicitly state that they have to be encoded (instead referencing RFC 2396 on page 19).

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

   For example, the following three URIs are equivalent:

It does state that to be in an HTTP parameter value they need to be inside double quotes though (RFC 2616 page 16).

   Many HTTP/1.1 header field values consist of words separated by LWS
   or special characters. These special characters MUST be in a quoted
   string to be used within a parameter value (as defined in section

       token          = 1*<any CHAR except CTLs or separators>
       separators     = "(" | ")" | "<" | ">" | "@"
                      | "," | ";" | ":" | "\" | <">
                      | "/" | "[" | "]" | "?" | "="
                      | "{" | "}" | SP | HT

So, as of HTTP version 1.1 we have < and > indirectly requiring encoding (via RFC 2396), but the HTTP standard itself no longer requires the encoding, leaving the HTTP protocol potentially vulnerable on its own. But that’s OK, because RFC 2396 still offers protection (RFC 2396 page 9):

   The angle-bracket "<" and ">" and double-quote (") characters are
   excluded because they are often used as the delimiters around URI in
   text documents and protocol fields.  The character "#" is excluded
   because it is used to delimit a URI from a fragment identifier in URI
   references (Section 4). The percent character "%" is excluded because
   it is used for the encoding of escaped characters.

   delims      = "<" | ">" | "#" | "%" | <">

The nail in the coffin, Updating URI Generic Syntax.

Then the actual issue occurred. RFC 3986 updated 1738, made 2396 obsolete, and made a slight change (RFC 3986 pages 11-12):

   URIs include components and subcomponents that are delimited by
   characters in the "reserved" set.  These characters are called
   "reserved" because they may (or may not) be defined as delimiters by
   the generic syntax, by each scheme-specific syntax, or by the
   implementation-specific syntax of a URI's dereferencing algorithm.
   If data for a URI component would conflict with a reserved
   character's purpose as a delimiter, then the conflicting data must be
   percent-encoded before the URI is formed.
   reserved    = gen-delims / sub-delims

      gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

      sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

Notice something missing? No more < or > (or % or ” for that matter, but that’s more complicated).

Maybe this RFC isn’t ambiguous though? Consider the line from the excerpt above (RFC 3986 Page 11):

“If Data for a URI component would conflict with a reserved character’s purpose as a delimiter, then the conflicting data must be percent encoded before the URI is formed”

Here’s the issue: the later RFC, 3986, is referring to delimiters of URIs, whereas RFC 2396 is referring to delimiters in content (ostensibly not its job as a URI standard).

Summary and timeline

In short the problem is: HTTP shifted decisions about its own content to an RFC for URIs, and that URI RFC is now obsolete, replaced by another which does not offer this protection.

RFC    Standard                      Requires encoding    Requires encoding
                                     in URI family?       in HTTP family?
1738   URL (updated by 1808)         Yes                  N/A
1808   Relative URL (updates 1738)   Yes                  N/A
1945   HTTP 1.0                      Yes                  Yes
2068   HTTP 1.1                      Yes                  Yes
2396   URI Generic                   Yes                  Yes
2616   HTTP 1.1                      Yes                  No
3986   URI Generic                   No                   No

So the error was introduced into HTTP in RFC 2616 but did not manifest until RFC 3986 removed the mitigations from the URI syntax.

Implications and other considerations

There are a few implications that come to mind, most notably who is responsible for a decision about something in a specification, and whether this particular case may be leading to multiple-encoding vulnerabilities in applications.

Controlling responsibility for functionality in standards.

One of the core problems here was that, early on, an HTTP standard shifted control of a content-level decision to the URI specification, and that specification later removed the constraints that existed there for the purposes of HTTP. Early in this history we had two non-conflicting layers of protection, but by the end there were none. While the two protocols may appear conceptually to form a protocol stack, with no layer relying on assumptions about another, in practice this is not the case:

How it seems HTTP and URI interact, with HTTP sitting on top of URI syntax making no cross-dependent assumptions

They actually intertwine slightly.


When developing your own standards and protocols you need to carefully map out who owns what, and make security decisions about the data in your component based upon your component alone, not upon unfounded and potentially dangerous assumptions about the behavior of another component. Another common example is when web applications presume that the incoming TCP/IP details or the Referer header prove something. The former relies upon TCP/IP not being spoofed, while the latter presumes the client is a non-compromised web browser.


One potential problem with this inconsistent encoding across web browsers is that it may lead developers to decode their incoming data multiple times, or to simply keep decoding incoming requests until they decode no more, so that their applications always see the same data to process. But this may lead developers to introduce multiple-decode vulnerabilities in their applications.

Encoding can offer a degree of protection against some injection attacks, but not always: it can sometimes introduce them. Furthermore, web servers, application components, or the applications themselves will often decode percent-encoded requests transparently and on the fly. When an application, or its architecture, does this decoding in unanticipated ways you get double- and triple-encoding vulnerabilities.

For example, %25 is a percent character and %27 is an apostrophe (‘), so %2527 can be double-decoded first to %27, and then to an apostrophe. %252527 is triple-encoded, %25252527 is quadruple, and so on. This can introduce errors such as SQL injection in applications that check the input (and sometimes its first decoded variant) for unsafe content such as apostrophes, rather than using safe mechanisms like parameterized SQL statements.
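The decoding chain can be sketched with Python's standard urllib (a minimal illustration, not any specific server's behavior):

```python
from urllib.parse import unquote

layered = "%252527"        # a triple-encoded apostrophe
once = unquote(layered)    # "%2527" -- one layer removed
twice = unquote(once)      # "%27"
thrice = unquote(twice)    # "'" -- now a live SQL metacharacter
print(once, twice, thrice)

# The risky habit: decoding until the value stops changing, which
# guarantees every layered payload eventually reaches its raw form.
value = layered
while unquote(value) != value:
    value = unquote(value)
print(value)               # -> '
```

A validation check that inspects only `layered` (or `once`) never sees the apostrophe that the final decode produces.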

If you ever have, or suspect that you have, an application (or a component in its architecture) that decodes input more than once, ensure that:

1. Validation checks are made unnecessary by using safe techniques where possible,

2. Where validation checks are required, they are made as close as possible to the usage of the data,

3. All security testing you do checks at least triple-decoded variants.

Pass the iOS Privacy Salt – Hashing Does NOT Guarantee Privacy.

By Kate Pearce, Neohapsis & Neolabs

There has been a lot of concern and online chatter about iPhone/mobile applications and the private data that some send to various parties. Starting with the discovery of Path sending your entire address book to their servers, it has since also been revealed that other applications do the same thing. The other offenders include Facebook, Twitter, Instagram, Foursquare, Foodspotting, Yelp, and Gowalla. This corresponds nicely with some research I have been doing into device ID leakage on mobile devices, where I have seen the same leakages, excuses, and techniques applied and abused as those discussed around the address book leakages.

I have observed a few posts discussing the issues and proposing solutions. These solutions range from requiring iOS to request permission for address book access (as it does for location) to advising developers to hash sensitive data before sending it and compare hashes server-side.

The first idea is a very good one; I see few reasons a device’s geolocation is less sensitive than its address book. The second is only partial advice, however, and if taken as it is given in Martin May’s post, or Matt Gemmell’s arguments, it will not solve the privacy problems on its own. This is because (1) anonymised data isn’t anonymous, and (2) no matter what hashing algorithm you use, if the input material is sufficiently constrained you can compute, or precompute, all possible values.

Martin May’s two characteristics of a hash:

  • Identical inputs will yield the same hash
  • It is virtually impossible to deduce the original input from a hash if a strong hashing algorithm is used.

Of these two characteristics, the privacy implications of the first are not fully discussed, and the second is incorrect as stated.

Hashing will not solve the privacy concerns because:

  • Hashing Data does not Guarantee Privacy (When the same data is input)
  • Hashing Data does not Guarantee Secrecy (When the input values are constrained)

The reasons not discussed for this are centered on the fact that real world input is constrained, not infinite. Telephone numbers are an extreme case of this, as I will discuss later.

A quick primer on hashing

Hashing is a destructive, theoretically one-way process where some data is taken and put through an algorithm to produce some output that is a shadow of the input. Like a shadow, the same output is always produced by the same input, but not the other way around. (Same car, same shadow).

A very simple example of a hashing function is the modulus (or remainder). For instance the output from 3 mod 2 is the remainder when 3 is divided by 2, or 1. The percent sign is commonly used in programming languages to denote this operation, so similarly

                1 % 3 is 1,   2 % 3 is 2,   3 % 3 is 0,   4 % 3 is 1,   5 % 3 is 2,   etc.

If you take some input, you get the same output every time from the same hashing function. The hashing process is one-way because it intentionally discards some data about the original. This results in what are called collisions, and we can see some in our earlier example using mod 3: 1 and 4 give the same hash, as do 2 and 5. The example given will cause collisions approximately one time in three, but modern strong hashing functions are a great deal more complex than modulo 3. Even the “very broken” MD5 has collisions occur only about one time in every 2^24, or 1 in ~17,000,000.
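The modulo example can be written out as a toy hash function (illustrative only; real hash functions are vastly more complex):

```python
def tiny_hash(n: int) -> int:
    # A toy "hash": the remainder modulo 3. Destructive, one-way,
    # and deliberately collision-prone.
    return n % 3

for n in range(1, 6):
    print(n, "->", tiny_hash(n))
# 1 and 4 collide (both give 1), as do 2 and 5 (both give 2):
assert tiny_hash(1) == tiny_hash(4)
assert tiny_hash(2) == tiny_hash(5)
```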

A key point is that, with a hashing algorithm for any output there are theoretically an infinite number of inputs that can give it and thus it is a one-way, irreversible, process.

A second key point is that any input gives the same output every time. So, by checking if the hashes of two items are the same you can be pretty sure they are from the same source material.

Cooking Some Phone Number Hash(es)

(All calculations are approximate, if I’m not out by two orders of magnitude then…)

Phone numbers conform to a rather well-known format, or set of formats. A modern GPU can run about 20 million hashes per second (2×10^7), or about 1.7 trillion (1.7×10^12) per day. So, how does this fit with possible phone numbers?

A pretty standard phone number is made up of 1-3 digits for a country code, 3 for the area code, and 7 numbers, with perhaps 4 for the extension.

So, we have the following range of numbers:

0000000000000-0000 to 9999999999999-0000

Or, 10^13 possible numbers… About 60 days work to compute all possible values (and a LOT of storage space…)

If we now represent it in a few other forms that may occur to programmers…

+001 (234) 567-8910, 0012345678910, 001-234-5678910, 0012345678910(US), 001(234)5678910

We have maybe 10-20 times that, or several years’ calculation…

But real-world phone numbers don’t fill all possible values. For instance, take a US phone number. It is made up of the country code, 3 digits for the area code, and 7 numbers, with perhaps 4 for the extension. But:

  • The country code is known
  • The area code is only about 35% used, since only ~350 values are in use
  • The 7-digit codes are not completely full (let’s guess 80%)
  • Most numbers do not use extensions (let’s say 5% use them)

Now we only have 350 × (10,000,000 × 0.8) × 1.05, or 2.94 billion combinations (2.94×10^9). That is only a little over two minutes on a modern GPU. Even allowing for different representations of numbers, you could store all of that in a few gigabytes of RAM for instant lookup, or recalculate it every time and take longer. This is what is called a time-space tradeoff: the space of the memory versus the time to recalculate.
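The "space" side of that tradeoff can be sketched as a precomputed lookup table. This is a hypothetical miniature: the number format and the use of unsalted MD5 are illustrative assumptions, and the range is a tiny stand-in for the full ~2.94×10^9 candidates:

```python
import hashlib

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

# Precompute the hash of every number in one (hypothetical) exchange,
# then reverse any leaked hash with a dictionary lookup.
table = {md5_hex(f"001212555{n:04d}"): f"001212555{n:04d}"
         for n in range(10_000)}

leaked = md5_hex("0012125551234")   # a "raw" hash seen on the wire
print(table[leaked])                # -> 0012125551234
```

Scale the comprehension up to every plausible number format and you have exactly the rainbow-table-style reversal described above.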

Anyway, the two takeaways for our discussion here regarding privacy are:

1. Every unique output value probably corresponds to a unique input value, so this hashing anonymisation still has privacy concerns.
Since there are far fewer possible phone numbers than the collision space of even a broken hashing algorithm, there is little chance of collisions.

2. Phone numbers can be reverse-computed from raw hashes alone.
Because of the known constraints on input values, it is possible either to brute-force all values or to build a reasonably sized rainbow table on a modern system.

Hashing Does NOT Guarantee Privacy

Anonymising data by removing specific user identifying information but leaving in unique identifiers does not work to assuage privacy concerns. This is because often clues are in the data, or in linkages between the data. AOL learned this the hard way when they released “anonymised” search data.

Furthermore, the network effect can reveal a lot about you: how many people you connect to, and how many they connect to, can be a powerful identifier of you, not to mention a predictor of things like your career area and salary point (since more connections tend to mean richer).

For a good discussion of some of the privacy issues related to hashes see Matt Gemmell’s post, Hashing for Privacy in social apps.

Mobile apps also often send the device hardware identifier (which cannot be changed or removed) to servers and advertising networks, and I have also observed the hash of this (or of the WiFi MAC address) sent through. This hardly accomplishes anything: anyone who knows the device ID can hash it and look for that, and anyone who knows the hash can look for it, just as with the phone numbers. The hash is equally unique to my device, and equally unable to be changed.

Hashing Does not equal Secrecy

As discussed under “Cooking Some Phone Number Hash(es)”, it is possible to work back from a hash to the input, since we know some of the constraints operating upon phone numbers. Furthermore, even if we are not sure exactly how you are hashing data, we can simply put test data in and look for known hashes of it. If I know what 123456789 hashes to and I see that in the output, then I know how your app is hashing phone numbers.

The Full Solution to Privacy and Secrecy: Salt

Both of these issues can be greatly helped by increasing the complexity of the input into the hash function. This can both remove the tendency for anonymised data to carry identical identifiers across instances, and also reduce the chance of it becoming feasible to reverse-calculate all possible values. Unfortunately there is no perfect solution to this if user-matching functionality comes first.

The correct solution, as it should be used to store passwords, is entry-specific salting (for example with bcrypt). But that is not feasible for a matching algorithm, as it only works for comparing hashed input to stored hashes; it will not work for comparing stored hashes to stored hashes.

However, if you as a developer are determined to make a server side matching service for your users, then you need to apply a hybrid approach. This is not good practice for highly sensitive information, but it should retain the functionality needed for server side matching.

Your first privacy step is to make sure your hashes do not match those collected or used by anyone else, do this by adding some constant secret to them, a process called salting.

e.g., adding 9835476579080945368095468905486 to the start of every number before you hash

This will make all of your hashes different from those used by any other developer, but they will still compare properly: the same input will give the same output.
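A minimal sketch of the idea (the salt constant and the choice of SHA-256 are illustrative assumptions, not a recommendation for highly sensitive data):

```python
import hashlib

# Hypothetical app-wide secret salt: any long constant kept secret.
SALT = "9835476579080945368095468905486"

def match_token(phone: str) -> str:
    # Salted hash used only for server-side matching, never storage.
    return hashlib.sha256((SALT + phone).encode()).hexdigest()

# Same input still gives the same token, so matching works...
assert match_token("0012125551234") == match_token("0012125551234")
# ...but it differs from the unsalted hash anyone else would compute.
assert match_token("0012125551234") != hashlib.sha256(b"0012125551234").hexdigest()
```

An attacker who precomputed unsalted phone-number hashes gains nothing unless the salt itself leaks, which is exactly the residual risk discussed next.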

However, there is still a problem: if your secret salt is leaked or disclosed, the reversing attacks outlined earlier become possible. To mitigate this, increase the complexity of the input by hashing more complex data. So rather than just hashing the phone number, hash the name, email, and phone number together. This does introduce the problem of hashes disagreeing if any part of the input differs by misspelling, typos, etc.

The best way to protect your user’s data from disclosure, and your reputation from damage due to a privacy breach:

  • Don’t collect or send sensitive user data or hashes in the first place – using the security principle of least privilege.
  • Ask for access in a very obvious and unambiguous way – informed consent.

Hit me/us up on Twitter (@neohapsis or @secvalve) if you have any comments or discussion. (Especially if I made an error!)

[Update] Added author byline and clarified some wording.

Groundhog Day in the Application Security World

By Kate Pearce, a Security Consultant and Researcher at Neohapsis

Throughout the US on Groundhog Day, an inordinate amount of media attention will be given to small furry creatures and whether or not they emerge into bright sunlight or cloudy skies. In a tradition that may seem rather topsy-turvy to those not familiar with it, the story says that if the groundhog sees his shadow (indicating the sun is shining), he returns to his hole to sleep for six more weeks and avoid the winter weather that is to come.

Similarly, when a company comes into the world of security and begins to endure the glare of security testing, the shadow of what they find can be enough to send them back into hiding. However, with the right preparation and mindset, businesses can not only withstand the sight of insecurity, they can begin to make meaningful and incremental improvements to ensure that the next time they face the sun the shadow is far less intimidating.

Hundreds or thousands of issues – Why?

It is not uncommon for a Neohapsis consultant to find hundreds of potential issues to sort through when assessing a legacy application or website for the first time. This can be due to a number of reasons, but the most prominent are:

  1. Security tools that are paranoid/badly tuned/misunderstood
  2. Lack of developer security awareness
  3. Threats and technologies have evolved since the application was designed/deployed/developed

Security Tools that are Paranoid/Badly Tuned/Misunderstood

Security testing and auditing tools, by their nature, have to be flexible and able to work in most environments and at various levels of paranoia. Because of this, if they are not configured and interpreted with the specifics of your application in mind they will often find a large number of issues, of which the majority are noise that should be ignored until the more important issues are fixed. If you have a serious, unauthenticated SQL injection that exposes plain-text credit card and payment details, you probably shouldn’t spend a moment’s thought stressing about whether your website allows 4 or 5 failed logins before locking an account.

Lack of Developer Security Awareness

Developers are human (at least in my experience!), and have all the usual foibles of humanity. They are affected by business pressures to release first and fix bugs later, with the result that security bugs may be de-prioritized down as “no-one will find that” and so “later” never comes. Developers also are often taught about security as an addition rather than a core concept. For instance, when I was learning programming, I was first taught to construct SQL strings and verbatim webpage output and only much later to use parameterized queries and HTML encoding. As a result, even though I know better, I sometimes find myself falling into bad practices that could introduce SQL injection or cross-site scripting, as the practices that introduce these threats come more naturally to me than the secure equivalents.
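The two habits contrasted above can be sketched like this (Python and sqlite3 standing in for whatever stack you actually use):

```python
import sqlite3
import html

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
evil = "x'); DROP TABLE users; --"

# Bad habit: concatenating user input into the SQL string (injectable):
#   conn.execute("INSERT INTO users (name) VALUES ('" + evil + "')")

# Safe habit: a parameterized query; the driver binds the value as
# data, so the payload is stored inertly instead of executed.
conn.execute("INSERT INTO users (name) VALUES (?)", (evil,))

# Likewise, HTML-encode output instead of emitting it verbatim.
print(html.escape("<script>alert(1)</script>"))
# -> &lt;script&gt;alert(1)&lt;/script&gt;
```

The secure forms are barely longer than the insecure ones; the problem is that the insecure forms are usually the ones taught first.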

Threats and Technologies have Evolved Since the Application was Designed/Deployed/Developed

To make it even harder to manage security, many legacy applications are developed in old technologies which are either unaware of security issues, have no way of dealing with them, or both. And the threats keep moving: while SQL injection has been known about for around 15 years, and cross-site scripting a little less than that, some threats are far more recent, such as clickjacking and CSS history stealing.

When an application was developed without awareness of a threat, it is often more vulnerable to it, and when it was built on a technology that was less mature in approaching the threat, remediating the issues can be far more difficult. For instance, try remediating SQL injection in a legacy ASP application by changing queries from string concatenation to parameterized queries (ADODB objects aren’t exactly elegant to use!).

Dealing with issues

Once you have found issues, then comes the daunting task of prioritizing, managing, and preventing their reoccurrence. This is the part that can bring the shock, and the part that can require the most care, as this is a task in managing complexity.

The response to issues requires not only looking at what you have found previously, but also what you have to do, and where you want to go. Breaking this down:

  1. Understand the Past – Deal with existing issues
  2. Manage the Present – Remedy old issues, prevent introduction of new issues where possible
  3.  Prepare for the Future – Expect new threats to arise

Understand the Past – Deal with Existing Issues

When dealing with security reports, it is important to always be psychologically and organizationally prepared for what you find. As already discussed, this is often unpleasant and the first reactions can lead to dangerous behaviors such as overreaction (“fire the person responsible”) or disillusionment (“we couldn’t possibly fix all that!”). The initial results may be frightening, but flight is not an option, so you need to fight.

To understand what you have in front of you, and to react appropriately, it is imperative that the person interpreting the results understands the tools used to develop the application; the threats surrounding the application; and the security tool and its results. If your organization is not confident in this ability, consider getting outside help or consultants (such as Neohapsis) in to explain the background and context of your findings.

 Manage the present – Remedy old issues, prevent introduction of new issues where possible

Much like any software bug or defect, once you have an idea of what your overall results mean you should start making sense of them. This can be greatly aided through the use of a system (such as Neohapsis Security Manager) which can take vulnerability data from a large number of sources and track issues across time in a similar way to a bug tracker.

Issues found should then be dealt with in order of the threat they present to your application and organization. We have often observed a tendency to go for the vulnerabilities labeled as “critical” by a tool, irrespective of their meaning in the context of your business and application. A SQL injection bug in your administration interface that is only accessible by trusted users is probably a lot less serious than a logic flaw that allows users to order items and modify the price communicated and charged to zero.

Also, if required, your organization should rapidly institute training and awareness programs so that no more avoidable issues are introduced. This can be aided by integrating security testing into your QA and pre-production testing.

 Prepare for the future – Expect new threats to arise

Nevertheless, even if you do everything right, and even if your developers do not introduce any avoidable vulnerabilities, new issues will probably be found as the threats evolve. To detect these, you need to regularly have security tests performed (both human and automated), keep up with the security state of the technologies in use, and have plans in place to deal with any new issues that are found.


It is not unusual to find a frightening degree of insecurity when you first bring your applications into the world of security testing, but diving back to hide is not prudent. Utilizing the right experience and tools can turn being afraid of your own shadow into being prepared for the changes to come. After all, if the cloud isn’t on the horizon for your company then you are probably already immersed in it.