[question] mmap write only
jeremy-coulon opened this issue · 1 comments
Hi,
I am currently using linux memory mapping for caching remote files. I would like to know if I can replace my implementation with LLFIO (and maybe have linux/windows portability).
Overview of my process:
- mmap a cache file. already available pages are readonly. missing pages are write only.
- when we try to read some write-only pages, catch the signal
- download the missing pages from elsewhere
- mark the page readonly and signal the application that pages are now available
Reading through the documentation and code, it seems that LLFIO can't mmap a file as write-only?
I am currently mmap'ing my file this way on Linux:
std::size_t size = ...
int fd = ... // ::open() or ::shm_open()
std::byte* mapping = reinterpret_cast<std::byte*>(::mmap(nullptr, size, PROT_WRITE, MAP_SHARED | MAP_NORESERVE, fd, 0));
Is it possible to have the same behavior with llfio?
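For reference, the flow described above could be sketched roughly as follows. This is a minimal, Linux-only illustration under several assumptions, not the actual implementation: it uses an anonymous `PROT_NONE` mapping instead of a write-only file mapping (the mmap man page notes that on some architectures, e.g. i386, `PROT_WRITE` implies `PROT_READ`, so a read of a write-only page may not fault there), and `fetch_page` and `map_cache` are hypothetical names, with `fetch_page` standing in for the real download. Calling `mprotect` from a signal handler is not guaranteed async-signal-safe by POSIX; it merely tends to work on Linux.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

namespace {
unsigned char* g_base = nullptr;  // start of the cache mapping
std::size_t g_size = 0;           // mapping size in bytes
std::size_t g_page = 0;           // system page size

// Hypothetical stand-in for "download the missing page from elsewhere":
// fills page N with the letter 'A' + N % 26.
void fetch_page(unsigned char* page, std::size_t index) {
    std::memset(page, static_cast<int>('A' + index % 26), g_page);
}

// SIGSEGV handler: populate the faulting page, then make it read-only.
void on_segv(int, siginfo_t* info, void*) {
    const auto addr = reinterpret_cast<std::uintptr_t>(info->si_addr);
    const auto base = reinterpret_cast<std::uintptr_t>(g_base);
    if (addr < base || addr >= base + g_size) {
        // Not a fault in our mapping: restore the default action so the
        // retried access terminates the process as usual.
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    const std::size_t index = (addr - base) / g_page;
    unsigned char* page = g_base + index * g_page;
    mprotect(page, g_page, PROT_READ | PROT_WRITE);  // open for filling
    fetch_page(page, index);
    mprotect(page, g_page, PROT_READ);               // now available, read-only
    // Returning from the handler retries the faulting read.
}
}  // namespace

// Reserves `pages` inaccessible pages and installs the fault handler.
// Returns nullptr on failure.
unsigned char* map_cache(std::size_t pages) {
    g_page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    g_size = pages * g_page;
    void* p = mmap(nullptr, g_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    g_base = static_cast<unsigned char*>(p);

    struct sigaction sa {};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, nullptr) != 0) return nullptr;
    return g_base;
}
```

With this in place, the first read of any byte in the mapping triggers the handler once for that page, and later reads of the same page proceed without a fault.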
Obviously I'd advise in the strongest possible terms against using signals to be notified when a page is not in cache. Use any other way except that. POSIX signals are fraught with nasty surprises. Also, doing network i/o in random 4 KB lumps is high latency compared to alternative approaches.
Setting that aside, yes, you're right that mapping a file for write in LLFIO also maps it for read. The reason is portability: Linux is unusual in supporting write-only maps. LLFIO also doesn't expose any way of changing the read/write status of individual pages; all you can do is commit and decommit them, which is analogous to allocate/deallocate. This is because the C abstract machine has no concept of memory having readability/writability separate from C's static typing.
A portable design of your solution would need to be less Linux-specific. You also want to avoid the implicit kernel transition when a page fault happens: Linux has just about the quickest implementation out there, so on any other platform you'd see performance suffer.
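One possible shape for such a design, purely as a sketch: track which chunks are present in an ordinary bitmap and have readers go through an explicit call that fetches anything missing, so no page faults or signals are involved. Everything here is an assumption for illustration: `ChunkCache`, its `read` method and the `Fetch` callback are hypothetical names, and a `std::vector` stands in for what would in practice be a read/write mapped cache file. The chunk size is a parameter so that fetches can be much larger than a single page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical explicit-check cache: no page protection tricks, just a
// presence bitmap consulted before every read.
class ChunkCache {
public:
    // Callback that fills dst[0..len) with the remote bytes at `offset`.
    using Fetch =
        std::function<void(std::size_t offset, std::byte* dst, std::size_t len)>;

    // `chunk` must be non-zero.
    ChunkCache(std::size_t size, std::size_t chunk, Fetch fetch)
        : data_(size),
          present_((size + chunk - 1) / chunk, false),
          chunk_(chunk),
          fetch_(std::move(fetch)) {}

    // Ensures [offset, offset + len) is cached, then returns a pointer to it.
    const std::byte* read(std::size_t offset, std::size_t len) {
        assert(len > 0 && offset + len <= data_.size());
        const std::size_t first = offset / chunk_;
        const std::size_t last = (offset + len - 1) / chunk_;
        for (std::size_t c = first; c <= last; ++c) {
            if (present_[c]) continue;
            const std::size_t begin = c * chunk_;
            // The final chunk may be shorter than chunk_.
            const std::size_t n = std::min(chunk_, data_.size() - begin);
            fetch_(begin, data_.data() + begin, n);
            present_[c] = true;
            ++fetches_;
        }
        return data_.data() + offset;
    }

    // Number of chunks fetched so far.
    std::size_t fetches() const { return fetches_; }

private:
    std::vector<std::byte> data_;  // stand-in for a mapped cache file
    std::vector<bool> present_;    // one flag per chunk
    std::size_t chunk_;
    Fetch fetch_;
    std::size_t fetches_ = 0;
};
```

The cost is that every access has to go through `read` rather than dereferencing the mapping directly; in exchange the same code should behave identically on Linux and Windows. This sketch is single-threaded, so concurrent readers would need their own synchronisation.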
You may be aware already that there was a commercial product in the 1990s implementing exactly what you have implemented, except for C++ objects. This is why the early STL implemented Allocators with an indirecting memory model, and Bjarne designed early C++ around a cache-on-demand memory model driven by page faults.