An Introduction to ARM exploitation


Part 1: Getting set up


This tutorial will cover the basics of exploit development on the ARM platform. This is aimed at beginners interested in exploit development and "hacking" so you shouldn't need too much knowledge on this subject before reading. You should, however, have a basic understanding of how a CPU works, binary, hex, what assembly code is and how to program in C.


To follow along with this tutorial you will need a 32 bit ARM device. As I focus most of my work on iPhones and iOS, I will use an iPhone 4 for this tutorial. This device has the Apple A4 chip (S5L8930X) which is an ARMv7 (32 bit) processor. You can use any other 32 bit iOS device for this tutorial but the iPhone 4 is probably the best device as it's latest supported version is 7.1.2 which is publicly jailbreakable whereas other 32 bit devices' latest versions may not be publicly jailbreakable and so you may have to wait for a new jailbreak release before being able to follow along with this tutorial.


If you have a 64 bit (ARM64) device, you can still attempt to follow along with this but there will be some complications so you may have difficulty following it exactly.


To get started you'll need to install a few tools that'll allow you to compile C code on-device and debug it. Add http://cydia.radare.org to your Cydia sources and install radare2-arm32. This will install the radare debugger onto your device which will be useful later for analysing code and disassembling programs to understand how they work.


Also add http://coolstar.org/publicrepo and install LLVM+Clang. This will allow you to compile C programs on-device without the assistance of a computer. You'll also need an SDK (Software Development Kit). I'm currently using the iPhoneOS8.1.sdk on my device. You can download these online or move them over to your device from Xcode on your Mac.

The other 2 essential things are iFile and an iOS terminal emulator of your choice (in my case, MTerminal). I assume the majority of people will already have these 2 applications installed.



Part 2: Creating a vulnerable program


To get started, we first need a program with a vulnerability that we can manipulate and use to our advantage. This is known as exploitation. Exploitation is often used to take control or hi-jack the execution flow of a process in order to make it do something that it normally shouldn't. In iOS jailbreaking, kernel exploits are used to remount the iOS filesystem as R/W, allow unsigned binaries to be executed on the device and much more. The exploit shown in this tutorial will obviously be very basic and much much simpler than anything used in real world situations like iOS jailbreaking etc. However, the example I will use teaches the foundation of exploit development and therefore should be a good place for you to start if you are interested.


The source code of the program we will be exploiting is down below.


#include 
#include 
#include 
#include 

void secret(){

printf("You shouldn't be here ;P\n");
system("ls -la /");


}

int main(){

char buff[12];
gets(buff);

printf("You entered %s\n",buff);

return 0;

}

In iFile (or Terminal if you prefer) create a file called "hello.c" and paste the above code inside.


Hopefully you have a basic understanding of C. If not, I'll breifly explain what this program does.


As you can see, this program has only 2 functions, main() and secret(). The secret() function is never actually called (never used) by the program. We'll get back to that in a minute.


The main() function is the only function executed in this program. As you can see it first declares a buffer of 12 characters named "buff". This is essentially an array of characters or a string that is designed to hold 12 characters.


The next line uses the gets() function from libc to take an input from the user and place this input into the 12 character buffer.


The rest of this function isn't too important. All it does is print back what the user supplied and then returns 0.


To compile and test this program we need to use clang. In MTerminal on the device, go to the directory your have your hello.c in by typing "cd /path/to/directory". Then type "clang hello.c -isysroot /path/to/sdk -fno-pie -fno-stack-protector -mno-thumb -o hello". This command will compile your source code into an executable program. The "-fno-pie -fno-stack-protector" disable some security features that would make this tutorial more challenging. The "-mno-thumb" ensures that this program will execute in ARM mode, not THUMB mode. You can type "ls" to list all the files in the current directory and you should see the executable file named "hello".


Part 3: Debugging and Exploitation


To run this executable, type the path to this file. Instead of typing the full path (such as /var/mobile/....) you can type ./ as we are already in the same directory as the executable. So type "./hello" and the program should run.

You can see that before the program even starts, it warns us that it is vulnerable as it uses the gets() function. This is a very dangerous function as there is no input validation so it will take whatever size input you supply it and will attempt to stuff all that data into the 12 character buffer. Obviously if you enter 50 characters, they cannot all fit into the 12 character buffer. The remaining characters will end up being written onto the stack (an area of the process memory) but will likely overwrite other information already stored on the stack such as return addresses and other important values. This type of vulnerbility is known as a stack buffer overflow.


In the screenshot you can see that I entered "heyyyyyy" into the program (which is less than 12 chars so fits nicely into the buffer) and the program prints back out that string. It then exits normally. Now I will enter an input that contains over 12 characters to see what happens.

Now we see something different. Instead of the program exiting normally, we get Segmentation fault: 11. This means that the program crashed (probably because we entered so many characters). To manipulate this crash and write an exploit for this simple app, we need a way to anaylse the crash. You can use CrashReporter from Cydia for this.


Crash reports will provide you alot of information. We only need to focus on the register section. This shows the values each of the registers were holding right before the program crashed. On ARM there are 15 registers (actually there are more but we only need to think about these 15). R0-R12 are general purpose registers. R13-R15 are special registers. R13 is the Stack Pointer, R14 is the Link Register and R15 is the Program Counter (pc). The pc stores the address in memory of the next instruction to be executed by the CPU. Usually, this would increment so that the CPU would execute instructions in order.

In the crash report above, you can see the value that the Program Counter contains at the time of the crash. That value is 0x41414140. 0x41 is the hex value for "A", meaning that the Program Counter contains "AAAA". Since the address 0x41414140 is invalid memory, when the program attempts to jump to this location and continue executing, it crashes because there are no instructions at this address. In the eyes of an attacker, this crash is a great sign and the analysis of the crash is even better because it proves that we have full control of the Program Counter register. This is due to the fact that the return address for the main() function is stored somewhere on the stack and when we write too many characters (in this case a bunch of "A"s) they exceed the boundaries of the 12 character buffer and begin overwriting important information on the stack. As we can see from the crash report, we overwrote the return address for the main() function. This means that we can place any value we like into the Program Counter register and the program will happily jump to it.


If we place a valid address in memory instead of 0x41414140 the program will jump to this address and continue executing. This would result in us having hi-jacked the program execution flow meaning we can make it do whatever we want.


If you remember earlier, there is a function called secret() in this program that contains some code but is never actually used anywhere in the program. Using this buffer overflow vulnerability, we should be able to redirect execution flow to this function. To do this we need to disassemble the program to find the address of this function.


Back in MTerminal, type "r2 hello" to open the "hello" program in radare2. Then type "aa" to automatically analyse the program. Type "afl" to list all of the functions found inside this program.

You can see amongst the list of functions, the secret function is there along with it's start address to the left. This 0x0000bee4 address is the address that we want the program to jump to when we overwrite the return address on the stack. 0x0000bee4 is a location in memory that contains the first instruction of the secret function and so jumping to it effectively calls that function.


If you want to analyse the secret() function in more detail, you can disassemble is by typing "s sym._secret" and then "pdf" to print the function disassembly.

The above screenshot shows the ARM assembly code for the secret() function. For this simple exploit tutorial, you do not need to understand ARM assembly but it is a good thing to learn if you want to move on to more advanced exploits involving Return Oriented Programming in the future.


Anyway, all we need to know for this exploit is the address of the first instruction in this function which can also be seen in the assembly output - 0x0000bee4. Our goal is to craft an input string that will overflow the buffer and overwrite the return address on the stack with 0x0000bee4. When this value is popped back into the Program Counter, the program will jump to the secret function.


To do this, we need to know exactly where in the string of "A"s the return address is overwritten. We can find this easily by using a pattern as the input. If we type "AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ" into the program, when we examine the crash log we will easily be able to see which 4 ASCII characters the Program Counter contains as the values will be different.

After crashing the program with this string you can see that the Program Counter contained the value 0x46464644 which is the hex equivilent of "FFFF". This means that the position of the "FFFF" in the above string is the position we need to place the address of the secret() function.


To actually craft the exploit string we need to enter hex numbers directly, not only ASCII characters. To do this we can use printf and pipe the output into the "hello" program. This is done by typing "printf" followed by a string inside speech marks. Then we can use the "|" character followed by the program name in order to supply the output of printf as the input to the program.


To type a hex number by using printf, you can type "\x" followed by the hex byte. The "\x" will not be counted as part of the string but will interpret the 2 digits after it as a hex number. This is how we will type out the address of the secret() function. However, we must enter the address in reverse order due to the ARMv7 processor being Little Endian. This simply means that it reads each byte in reverse. So the address 0x0000bee4 would be typed as 0xe4be0000.


Our final exploit string should look something like this - "AAAABBBBCCCCDDDDEEEE\xe4\xbe\x00\x00".


As you can see in the screenshot above, we can use printf to supply our exploit string to the program. The gets() function within the program then attempts to write all of the data in the string into the 12 character buffer but there is not enough space. The remaining characters are then written onto the stack, overwriting already stored information including the return address of this function. The hex value 0x0000bee4 happens to be written to where the return address was previously causing the program to jump to this address when the function returns.


You can see that instead of a Segmentation fault or another kind of crash, this time the program prints "You shouldn't be here ;P" and then lists the contents of the / directory on the device. If we look back up at the source code for the program we can see that this is exactly what the secret() function does. Exploit complete!


Part 4: Conclusion


Hopefully you found this tutorial useful and gave you an idea of how software exploitation works if you are new. The secret() function in this tutorial program represents desireable code for an attacker. In a real world situation, this function could spawn a root shell and give you full access to a device. Alternatively, it could be an internal function that gives you extra features in a program that you shouldn't normally be able to access. Obviously in real world situation it would not be as simple as returning to a single function. Return Oriented Programming (ROP) is a method of chaining together small parts of many different functions in order to carry out a task. I plan to cover this in later tutorials.


Thanks for reading this tutorial! If you have any questions then tweet me @bellis1000