userfaultfd async wp mode in page guard manager
ishitatsuyuki opened this issue · 7 comments
Congratulations on shipping userfaultfd support in the page guard manager!
It looks like overhead was noted as a concern when using the userfaultfd option. I would like to point out that a recently added "Async Write-Protect" mode for userfaultfd is intended to address change-tracking use cases like this.
The feature is documented at https://docs.kernel.org/admin-guide/mm/userfaultfd.html#write-protect-notifications, and is used in Valve's Wine fork (https://github.com/ValveSoftware/wine/blob/bleeding-edge/dlls/ntdll/unix/virtual.c) to emulate Win32's write watches. What gfxr is doing should be similar to write watches, so perhaps the Wine source can be used as a reference when implementing async wp support.
Thank you for taking the time to suggest possible improvements.
From what I understand you are referring to the UFFDIO_WRITEPROTECT_MODE_WP / UFFD_FEATURE_PAGEFAULT_FLAG_WP feature flags. We are aware of them, but they have some limitations and turn out not to be very useful for our case.
The first reason is that this feature is relatively new and is not broadly supported by all kernels that exist now on Android phones (at least not on every phone that we have tested internally).
The second and more important reason is that it doesn't do exactly what we need.
From what I understand, UFFDIO_WRITEPROTECT_MODE_WP should make it possible to track writes even to allocated/existing pages. Unfortunately this is not enough in our case, as we also want to track reads, not only writes.
I will take a look at the source file you pointed. Perhaps there is something clever there that we can also implement.
You are right that async WP doesn't provide any support for detecting read-only page faults.
My understanding is that async WP should eliminate the need for shadow memory and hence it will be unnecessary to track read-only faults, like how the D3D12 backend works when using GetWriteWatch. I haven't looked at the implementation yet, but please let me know in case the Vulkan backend works in some different way such that skipping tracking reads would not be viable.
To be honest we haven't tried skipping shadow memory completely with userfaultfd (or maybe we did but that was some time ago and I can't remember exactly).
We have tried that with the mprotect mechanism (applying the mprotect + SIGSEGV trick directly on the mapped memory returned by the driver), but it didn't work out well. IIRC there were random crashes.
I guess it wouldn't hurt to try it with userfaultfd.
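For readers unfamiliar with the mprotect + SIGSEGV trick mentioned above, here is a minimal sketch of the general idea on a private anonymous mapping. It is not GFXR's implementation: the names tracked_base, page_dirty, on_segv and start_tracking are made up for illustration, it only tracks writes (the region starts out read-only), and it calls mprotect from a signal handler, which works on Linux in practice but is not on POSIX's list of async-signal-safe functions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define TRACKED_PAGES 4 /* hypothetical size, for illustration only */

static uint8_t *tracked_base;
static size_t tracked_page_size;
static volatile sig_atomic_t page_dirty[TRACKED_PAGES];

/* SIGSEGV handler: mark the faulting page dirty, then make it writable
 * so the faulting store is retried and succeeds. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)tracked_base;

    if (addr < base || addr >= base + TRACKED_PAGES * tracked_page_size) {
        /* Not our region: restore the default action so the retried
         * access crashes the process as it normally would. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }

    size_t index = (addr - base) / tracked_page_size;
    page_dirty[index] = 1;
    mprotect(tracked_base + index * tracked_page_size, tracked_page_size,
             PROT_READ | PROT_WRITE);
}

/* Maps a read-only anonymous region and installs the handler.
 * Returns 0 on success, -1 on failure. */
static int start_tracking(void)
{
    tracked_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, TRACKED_PAGES * tracked_page_size, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    tracked_base = mem;

    for (size_t i = 0; i < TRACKED_PAGES; i++)
        page_dirty[i] = 0;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

After start_tracking() succeeds, the first store to a page (for example through a volatile uint8_t pointer to tracked_base) faults once, sets the matching page_dirty entry, and then completes. Tracking reads as well would mean starting from PROT_NONE instead of PROT_READ.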
I am afraid that this is not possible. If I understand the uffd documentation correctly, both UFFDIO_REGISTER_MODE_MISSING and UFFDIO_REGISTER_MODE_WP must be applied on private anonymous memory regions. The memory regions returned from the driver are not expected to fulfill this criterion.
I ran some very brief tests in which I skipped allocating shadow memory and registered the mapped memory directly with uffd, and the registration fails:
E gfxrecon: ioctl/uffdio_register: Invalid argument
E gfxrecon: uffdio_register.range.start: 0x7700769000
E gfxrecon: uffdio_register.range.len: 524288
Checking the /proc/pid/maps file for this region:
7700769000-77007e9000 rw-s 10e10e000 00:10 950 /dev/dri/renderD128
This is unfortunately poorly documented but any memory region is allowed if you use ASYNC WP. Could this help?
Interesting. This doesn't agree with the documentation found online. I only tried applying UFFDIO_REGISTER_MODE_MISSING, which is what is currently used. I'll look into it a bit more.
No, it's still failing. I'm initializing with UFFD_FEATURE_PAGEFAULT_FLAG_WP, registering only with UFFDIO_REGISTER_MODE_WP, and registrations fail with EINVAL. Tried this on both Android and desktop. Maybe I am missing something? Have you ever done something similar?
Edit:
It looks like this also requires UFFD_FEATURE_WP_ASYNC, not just UFFD_FEATURE_PAGEFAULT_FLAG_WP. I can't find this flag even in the userfaultfd.h on Ubuntu with kernel 6.5. This looks like a cutting-edge feature introduced in kernel 6.7 that is not broadly available even on desktop.
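One way to check at runtime whether a kernel offers the flag is the UFFDIO_API handshake with features set to zero, after which the kernel reports the feature bits it supports. The sketch below assumes that approach; the helper name uffd_supports_wp_async is made up, and the fallback value (1 << 15) for UFFD_FEATURE_WP_ASYNC is my reading of recent kernel headers, so it should be checked against the target kernel's userfaultfd.h.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older headers (e.g. kernel 6.5) do not define this flag.
 * Assumed value; verify against the target kernel's userfaultfd.h. */
#ifndef UFFD_FEATURE_WP_ASYNC
#define UFFD_FEATURE_WP_ASYNC (1 << 15)
#endif
/* Lets unprivileged processes open a uffd on kernels 5.11 and later. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/* Returns 1 if the running kernel reports UFFD_FEATURE_WP_ASYNC,
 * 0 if it does not, and -1 if userfaultfd could not be queried
 * (missing syscall, permissions, sandboxing, ...). */
static int uffd_supports_wp_async(void)
{
    int fd = (int)syscall(SYS_userfaultfd,
                          O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
    if (fd < 0) /* kernels older than 5.11 reject the extra flag */
        fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (fd < 0)
        return -1;

    struct uffdio_api api;
    memset(&api, 0, sizeof api);
    api.api = UFFD_API;
    api.features = 0; /* ask the kernel to report what it supports */

    int rc = ioctl(fd, UFFDIO_API, &api);
    close(fd);
    if (rc < 0)
        return -1;

    return (api.features & UFFD_FEATURE_WP_ASYNC) ? 1 : 0;
}
```

A tracker could call this once at startup and fall back to the existing mechanism when the result is 0 or -1. A fresh userfaultfd would then have to be opened to actually enable the feature, since this probe closes its descriptor.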
Edit 2:
Note for future reference:
UFFD_FEATURE_WP_ASYNC solves the problem in an asynchronous manner: the faults are not sent to the application as messages over the created fd to be resolved; instead they are handled by the kernel. The user app can collect the “written/dirty” status by looking up the uffd-wp bit for the pages of interest in /proc/pid/pagemap. This basically boils down to two things:
- This essentially becomes an alternative for the dirty bit in the page table entries which is not functioning properly on Android.
- It will require some rewriting, as it will not need to listen for messages but will have to parse /proc/pid/pagemap instead.
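As a rough illustration of the pagemap lookup, the sketch below reads one 64-bit entry from /proc/self/pagemap and tests bit 57, which the kernel's pagemap documentation describes as "pte is uffd-wp write-protected". The helper names are made up. The interpretation is my assumption about how async WP would be used: a page that was write-protected and now has the bit cleared has been written since the last protect.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_UFFD_WP (1ULL << 57) /* pte is uffd-wp write-protected */

/* Pure helper: returns 1 if the entry has the uffd-wp bit set, else 0. */
static int pagemap_entry_is_uffd_wp(uint64_t entry)
{
    return (entry & PM_UFFD_WP) != 0;
}

/* Reads the pagemap entry covering addr in the calling process.
 * Returns 0 on success and -1 on failure. */
static int read_pagemap_entry(const void *addr, uint64_t *entry_out)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    /* One 8-byte entry per virtual page, indexed by page number. */
    off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
                   (off_t)sizeof(uint64_t);
    ssize_t n = pread(fd, entry_out, sizeof *entry_out, offset);
    close(fd);

    return n == (ssize_t)sizeof *entry_out ? 0 : -1;
}
```

Scanning a large mapping one entry at a time like this would be slow; reading a contiguous run of entries with a single pread is the obvious refinement.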
Possible problems:
Multi-threaded applications might still cause problems if threads touch these pages while GFXR is parsing /proc/pid/pagemap to detect dirty pages.
Since there will be no need for shadow memory, it looks promising as a better and faster alternative for memory tracking on Linux and Android. But this should wait until 6.7 becomes standard for Android kernels.