manticoreos/manticore

Page table isolation

Opened this issue · 2 comments

The kernel uses a traditional "higher-half kernel" approach, which means that every userspace process's virtual address space has the kernel mapped in with appropriate protections. Unfortunately, the Meltdown attack is able to bypass page table protections, breaking isolation and allowing a process to read arbitrary memory mapped in its address space, kernel memory included.

Page table isolation is a technique to mitigate Meltdown that maintains a separate page table for the kernel and drops the kernel mappings from userspace process page tables. This has a significant cost for transitions between user and kernel mode (estimated at 5%-30% depending on workload) because the page table needs to be switched on each one.
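To make the "dropping kernel mapping" part concrete, here is a minimal sketch of deriving a process page table that has no higher-half entries. It assumes x86-64 4-level paging, where a top-level table has 512 slots and slots 256-511 cover the higher half. The names `top_level_table_t` and `make_user_table` are hypothetical and not taken from the Manticore source.

```c
#include <stdint.h>
#include <string.h>

#define PT_ENTRIES        512 /* slots in an x86-64 top-level table (PML4) */
#define KERNEL_FIRST_SLOT 256 /* first slot that covers the higher half */

/* Hypothetical stand-in for a top-level page table. */
typedef struct {
    uint64_t entries[PT_ENTRIES];
} top_level_table_t;

/*
 * Build a process table from a full table: keep the lower-half (userspace)
 * slots and clear every higher-half (kernel) slot, so kernel memory is
 * simply not mapped while the process table is active.
 */
void make_user_table(top_level_table_t *user, const top_level_table_t *full)
{
    memcpy(user->entries, full->entries,
           KERNEL_FIRST_SLOT * sizeof(uint64_t));
    memset(&user->entries[KERNEL_FIRST_SLOT], 0,
           (PT_ENTRIES - KERNEL_FIRST_SLOT) * sizeof(uint64_t));
}
```

A real implementation would most likely still need to keep a small amount of entry/exit code mapped in the process table, since the code that performs the switch has to be reachable before the kernel table is loaded.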

@penberg: Can you add outlines here as well?

I have not analyzed what exactly is needed, but the short version is that we should switch between the kernel and process page tables on kernel entry and exit, and make sure kernel memory is not mapped in process page tables.

We currently construct the kernel page table (called an MMU map in the code) in init_mmu_map. We should store it in a global variable, switch to it in syscall_entry upon entry, and switch back to the process map on exit, for example. We should also update process_run to allocate a new MMU map that knows nothing of the kernel one.
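A rough sketch of that flow, with the hardware page table register simulated by a plain variable so the logic can be followed in isolation. Everything here is an assumption for illustration: `mmu_map_t`, `mmu_load_map`, `kernel_mmu_map`, and the `*_sketch` functions are hypothetical names, not the actual Manticore API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical MMU map: just the address of the root page table. */
typedef struct {
    uintptr_t root;
} mmu_map_t;

struct process {
    mmu_map_t mmu_map; /* per-process map with no kernel mappings */
};

/* Simulated page table base register (CR3 on x86-64). */
static uintptr_t active_root;

/* Global kernel map, remembered once at boot. */
static mmu_map_t kernel_mmu_map;

static void mmu_load_map(mmu_map_t map)
{
    active_root = map.root;
}

/* Boot: record the kernel map in the global and activate it. */
void init_mmu_map_sketch(uintptr_t kernel_root)
{
    kernel_mmu_map.root = kernel_root;
    mmu_load_map(kernel_mmu_map);
}

/* Start running a process on its own map. */
void process_run_sketch(const struct process *p)
{
    mmu_load_map(p->mmu_map);
}

/* Kernel entry: switch to the kernel map before touching kernel data. */
void syscall_entry_sketch(void)
{
    mmu_load_map(kernel_mmu_map);
}

/* Kernel exit: switch back to the process map before returning. */
void syscall_exit_sketch(const struct process *p)
{
    assert(active_root == kernel_mmu_map.root);
    mmu_load_map(p->mmu_map);
}
```

The same switch would presumably be needed on every other kernel entry path as well (interrupts and exceptions), not only in syscall_entry.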