Recently I was reading through the line-up for the Hack in the Box Conference which will be held in Malaysia this October. The following talk made my ears perk up: “Remote Code Execution Through Intel CPU Bugs” By Kris Kaspersky. Briefly, this talk will cover the exploitation of Intel processor errata. Yes you heard that right, Kris has managed to exploit hardware bugs. He goes on to say that they have developed PoC code which allows for remote exploitation of at least on of these bugs.
When I first came across this I was impressed to say the least. I decided to re-read the Intel Errata to see if I could spot the exploitable conditions. There was some discussion when these were first released, including speculation into the exploitability of several of these, but like most people I didn’t think much of it (OS developers flip out all the time).
After reading through the Intel Core 2 documents I decided to check out the revisions for the Athlon 64 as well. That’s where I ran across this gem:
Errata 95: “RET Instruction May Return To Incorrect EIP”
Speaking of exploitability… Lets see what causes this.
In order to efficiently predict return addresses, the processor implements a 12-deep return address stack to pair RETs with previous CALLs.
Under the following unusual and specific conditions, an overflowed hardware return stack may cause a RET instruction to return to an incorrect EIP in 64-bit systems running 32-bit compatibility mode applications:
• A CALL near is executed in 32-bit compatibility mode.
• Prior to a return from the called procedure, the processor is switched to 64-bit mode.
• In 64-bit mode, subsequent CALLs are nested 12 levels deep or greater.
• The lower 32 bits of the 64-bit return address for the 12th-most deeply nested CALL in 64-bit mode matches exactly the 32-bit return address of the original 32-bit mode CALL.
• A series of RETs is executed from the nested 64-bit procedures.
• The processor returns to 32-bit compatibility mode.
• A RET is executed from the originally called procedure.
So lets assume you have a 32 bit application running in compatibility mode that you would like to exploit and can force a somewhat long function to be repeatedly called. You could create a 64 bit thread with a very tight function that recursively calls itself 12 times and returns to a address which matches (lower 32 bits) the return of the function you are targeting.This would create a bit of a race, but it would be very winnable given a slightly complex target and a tight exploit loop.
Of course the errata doesn’t detail what the incorrect return address might be but assuming it can somehow be predicted or controled this could be a fun little bug. This specific bug only exists on a small subset of AMD hardware, specifically CPUIDs 0xF51, 0xF58, and 0xF48. If anyone has a processor with the bug and would like to experiment with it I would love to hear from you.