As part of an Intro to Security course I’m taking, my professor gave us a crackme style exercise to practice reading x86 assembly and basic reverse engineering.
The program is pretty simple. It accepts a password as an argument and we’re told that if the password is correct, "ok" is printed.
$ ./crackme
usage: ./crackme <secret>
$ ./crackme test
$
As usual, I start by running file
on the binary, which shows that it’s a
standard x64 ELF binary. file
also says that the binary is "not stripped", which means
that it includes symbols. All I really know about symbols are that they can
include debugging information about a binary like function and variable names
and some symbols aren’t really necessary; they can be stripped out to reduce
the binary’s size and make reverse engineering more challenging. Maybe I’ll
do a more in depth post on this in the future.
$ file crackme
crackme: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x3fcf895b7865cb6be6b934640d1519a1e6bd6d39, not stripped
Next, I run strings
, hoping to get lucky and find the password amongst the
strings in the binary. Strings looks for series of printable characters followed
by a NULL, but unfortunately nothing here works as the password.
$ strings crackme
/lib64/ld-linux-x86-64.so.2
exd4
libc.so.6
puts
printf
memcmp
__libc_start_main
__gmon_start__
GLIBC_2.2.5
fffff.
AWAVA
AUATL
[]A\A]A^A_
usage: %s <secret>
;*3$"
Since that didn’t work, we’re forced to disassemble the binary and
actually try to reverse engineer it.
We’ll start with main
.
$ gdb -batch -ex 'file crackme' -ex 'disas main'
Dump of assembler code for function main:
0x00000000004004a0 <+0>: sub rsp,0x8
0x00000000004004a4 <+4>: cmp edi,0x1
0x00000000004004a7 <+7>: jle 0x4004c7 <main+39>
0x00000000004004a9 <+9>: mov rdi,QWORD PTR [rsi+0x8]
0x00000000004004ad <+13>: call 0x4005e0 <verify_secret>
0x00000000004004b2 <+18>: test eax,eax
0x00000000004004b4 <+20>: je 0x4004c2 <main+34>
0x00000000004004b6 <+22>: mov edi,0x4006e8
0x00000000004004bb <+27>: call 0x400450 <puts@plt>
0x00000000004004c0 <+32>: xor eax,eax
0x00000000004004c2 <+34>: add rsp,0x8
0x00000000004004c6 <+38>: ret
0x00000000004004c7 <+39>: mov rsi,QWORD PTR [rsi]
0x00000000004004ca <+42>: mov edi,0x4006d4
0x00000000004004cf <+47>: xor eax,eax
0x00000000004004d1 <+49>: call 0x400460 <printf@plt>
0x00000000004004d6 <+54>: mov eax,0x1
0x00000000004004db <+59>: jmp 0x4004c2 <main+34>
End of assembler dump.
Let’s break this down a little.
0x00000000004004a0 <+0>: sub rsp,0x8
0x00000000004004a4 <+4>: cmp edi,0x1
0x00000000004004a7 <+7>: jle 0x4004c7 <main+39>
Starting at the beginning, we see the stack pointer decremented as part of the function prologue. The prologue is a set of setup steps involving saving the old frame’s base pointer on the stack, reassigning the base pointer to the current stack pointer, then subtracting the stack pointer a certain amount to make room on the stack for local variables, etc. We don’t see the former two steps because this is the main function so it doesn’t really have a function calling it, so saving/setting the base pointer isn’t necessary.
Then the edi
register is
compared to 1 and if it is less than or equal, we jump to offset 39.
0x00000000004004c2 <+34>: add rsp,0x8
0x00000000004004c6 <+38>: ret
0x00000000004004c7 <+39>: mov rsi,QWORD PTR [rsi]
0x00000000004004ca <+42>: mov edi,0x4006d4
0x00000000004004cf <+47>: xor eax,eax
0x00000000004004d1 <+49>: call 0x400460 <printf@plt>
0x00000000004004d6 <+54>: mov eax,0x1
0x00000000004004db <+59>: jmp 0x4004c2 <main+34>
Here at offset 39, we print something then jump to offset 34 where we repair the stack (undo the sub instruction from the prologue) and return (ending execution).
This is likely how the program checks the arguments and prints the usage message if no arguments are supplied (which would cause argc/edi to be 1).
However if we supply an argument, edi
is 0x2 and we move past the jle
instruction.
0x00000000004004a9 <+9>: mov rdi,QWORD PTR [rsi+0x8]
0x00000000004004ad <+13>: call 0x4005e0 <verify_secret>
Here we can see the verify_secret
function being called with a parameter
in rdi
. This is most likely the argument we passed into the program. We can
confirm this with gdb (I’m using it with peda here).
gdb-peda$ tele $rsi
0000| 0x7fffffffeb48 --> 0x7fffffffed6e ("/home/vagrant/crackme/crackme")
0008| 0x7fffffffeb50 --> 0x7fffffffed8c --> 0x4548530074736574 ('test')
0016| 0x7fffffffeb58 --> 0x0
Indeed rsi
points to the first element of argv
, so incrementing that by 8 bytes
(because 64 bit) points to argv[1]
, which is our input.
If we look after the verify_secret
call we can see the program checks
if eax
is 0 and if it is, jumps to offset 34, ending the program. However, if
eax
is not zero, we’ll hit a puts
call before exiting, which will presumably
print out the "ok" message we want.
0x00000000004004b2 <+18>: test eax,eax
0x00000000004004b4 <+20>: je 0x4004c2 <main+34>
0x00000000004004b6 <+22>: mov edi,0x4006e8
0x00000000004004bb <+27>: call 0x400450 <puts@plt>
0x00000000004004c0 <+32>: xor eax,eax
0x00000000004004c2 <+34>: add rsp,0x8
0x00000000004004c6 <+38>: ret
Now lets disassemble verify_secret
to see how the input validation is performed,
and to see how we can make it return non-zero.
Dump of assembler code for function verify_secret:
0x00000000004005e0 <+0>: sub rsp,0x408
0x00000000004005e7 <+7>: movzx eax,BYTE PTR [rdi]
0x00000000004005ea <+10>: mov rcx,rsp
0x00000000004005ed <+13>: test al,al
0x00000000004005ef <+15>: je 0x400622 <verify_secret+66>
0x00000000004005f1 <+17>: mov rdx,rsp
0x00000000004005f4 <+20>: jmp 0x400604 <verify_secret+36>
0x00000000004005f6 <+22>: nop WORD PTR cs:[rax+rax*1+0x0]
0x0000000000400600 <+32>: test al,al
0x0000000000400602 <+34>: je 0x400622 <verify_secret+66>
0x0000000000400604 <+36>: xor eax,0xfffffff7
0x0000000000400607 <+39>: lea rsi,[rsp+0x400]
0x000000000040060f <+47>: add rdx,0x1
0x0000000000400613 <+51>: mov BYTE PTR [rdx-0x1],al
0x0000000000400616 <+54>: add rdi,0x1
0x000000000040061a <+58>: movzx eax,BYTE PTR [rdi]
0x000000000040061d <+61>: cmp rdx,rsi
0x0000000000400620 <+64>: jb 0x400600 <verify_secret+32>
0x0000000000400622 <+66>: mov edx,0x18
0x0000000000400627 <+71>: mov esi,0x600a80
0x000000000040062c <+76>: mov rdi,rcx
0x000000000040062f <+79>: call 0x400480 <memcmp@plt>
0x0000000000400634 <+84>: test eax,eax
0x0000000000400636 <+86>: sete al
0x0000000000400639 <+89>: add rsp,0x408
0x0000000000400640 <+96>: movzx eax,al
0x0000000000400643 <+99>: ret
End of assembler dump.
I won’t walk through this one in detail because understanding each line
isn’t necessary to crack this. Let’s skip to
the memcmp call. If memcmp returns 0, eax
is set to 1 and the function
returns. This is exactly what we want. From the man page, memcmp
takes three
parameters, two buffers to compare and their lengths, and returns 0 if the
buffers are identical.
0x0000000000400622 <+66>: mov edx,0x18
0x0000000000400627 <+71>: mov esi,0x600a80
0x000000000040062c <+76>: mov rdi,rcx
0x000000000040062f <+79>: call 0x400480 <memcmp@plt>
Here’s the setup to the memcmp
call. We can see the third parameter for length
is the immediate 0x18 meaning the buffers will be 24 bytes in length. If we
examine address 0x600a80, we find this 24 byte string:
gdb-peda$ hexd 0x600a80 /2
0x00600a80 : 91 bf a4 85 85 c3 ba b9 9f a6 b6 b1 93 b9 83 8f ................
0x00600a90 : ae b1 ae c1 bc 80 ca ca 00 00 00 00 00 00 00 00 ................
Since this is a direct address to some memory, we can be fairly certain that
we’ve found some sort of secret value! Based on the movzx eax,BYTE PTR [rdi]
instruction (offset 7)
which moves a byte from the input string into eax, the xor eax, 0xfffffff7
instruction (offset 36), and
the add rdi, 0x1
instruction (offset 54) which increments the char*
pointer to our input string, we can reasonably guess
that this function is xor’ing each character of our input with 0xf7 and writing
the result into a buffer which begins at rsp
(also pointed to by rcx
). Since
we now know the secret (\x91\xbf\xa4\x85...
) and the xor key (0xf7
) it’s
pretty easy to extract the password we need by xor’ing each byte of the secret
with the xor key.
Here’s a way to do this with python.
{% highlight python %} str = ‘\x91\xbf\xa4\x85\x85\xc3\xba\xb9\x9f\xa6\xb6\xb1\x93\xb9\x83\x8f\xae\xb1\xae\xc1\xbc\x80\xca\xca’ ba = bytearray(str) for i, byte in enumerate(ba): ba[i] ^= 0xf7 print ba {% endhighlight %}
Which results in this:
$ python crack.py
fHSrr4MNhQAFdNtxYFY6Kw==
$ ./crackme fHSrr4MNhQAFdNtxYFY6Kw==
ok