RazrFalcon/memmap2-rs

Composite memory maps currently impossible

Closed this issue · 7 comments

I want to create a thread-local memory map (it could be file-based or anonymous; I'd prefer anonymous) in each of N threads, where the first M bytes of each thread-local mapping are mapped to the same file (which is itself mapped into memory). I know this allocation scheme is possible using plain mmap on Linux, but I can't find a way to do it in memmap2-rs. That is, there is no way to map the buffer of an MmapMut into the buffer of another Mmap* struct.

Am I wrong to believe this or is it truly impossible under the current API?

Can you please illustrate the C (or at least the plain mmap) call sequence you would use?

From your description, I wonder why you don't just pass the byte slice itself to the other threads? And what kind of synchronization exists between them, as multiple threads writing into the same mapping without synchronization is likely to imply a data race.

(It is also not clear to me what "thread-local memory map" means here, as a memory mapping is by design process-wide: all threads share a single address space in which all memory maps are contained.)

Here's some code from the slice_deque crate that calls POSIX mmap directly:

let ptr = mmap(
    ptr::null_mut(),
    size,
    PROT_READ | PROT_WRITE,
    MAP_SHARED,
    fd,
    0,
);
if ptr == MAP_FAILED {
    print_error("@first: mmap failed");
    if close(fd) == -1 {
        print_error("@first: close failed");
    }
    return Err(AllocError::Oom);
}

let ptr2 = mmap(
    (ptr as *mut u8).offset(half_size as isize) as *mut c_void,
    half_size,
    PROT_READ | PROT_WRITE,
    MAP_SHARED | MAP_FIXED,
    fd,
    0,
);

In the second call, ptr2 maps the same file again with MAP_FIXED, at an address half_size bytes into the first mapping at ptr, so the second mapping sits inside the first one.

The reason for this is that I'm implementing a concurrent Prolog runtime in Rust with a consolidated heap. The first section of the heap (the memory map shared among all threads) is the atom table. The second section, which comes immediately after it, is the Prolog heap. Cells in the Prolog heap compose Prolog terms, and Prolog terms can point down into the atom table.

From your description, I wonder why you don't just pass the byte slice itself to the other threads?

Because of the Prolog VM design, it's most convenient to map the atom table contiguously in memory next to the thread-local Prolog heap. In particular, converting atoms to strings (lists of characters) is very efficient this way.
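To make the layout concrete, here's a minimal sketch of what I have in mind, using the libc crate directly since memmap2 doesn't expose MAP_FIXED. The sizes and the atom_table_fd parameter are placeholders, and error handling is reduced to asserts:

use std::os::unix::io::RawFd;

// Placeholder sizes; the real values would come from the VM configuration.
const ATOM_TABLE_LEN: usize = 1 << 20; // shared, file-backed atom table
const HEAP_LEN: usize = 8 << 20;       // thread-local Prolog heap

unsafe fn map_thread_heap(atom_table_fd: RawFd) -> *mut u8 {
    // Reserve one contiguous anonymous region large enough for both sections.
    let base = libc::mmap(
        std::ptr::null_mut(),
        ATOM_TABLE_LEN + HEAP_LEN,
        libc::PROT_READ | libc::PROT_WRITE,
        libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
        -1,
        0,
    );
    assert_ne!(base, libc::MAP_FAILED);

    // Overlay the shared atom-table file onto the first ATOM_TABLE_LEN bytes.
    // MAP_FIXED replaces that part of the anonymous reservation in place.
    let overlay = libc::mmap(
        base,
        ATOM_TABLE_LEN,
        libc::PROT_READ | libc::PROT_WRITE,
        libc::MAP_SHARED | libc::MAP_FIXED,
        atom_table_fd,
        0,
    );
    assert_ne!(overlay, libc::MAP_FAILED);

    base as *mut u8
}

Each VM thread would call this once; the thread-local Prolog heap then starts at base + ATOM_TABLE_LEN, directly after the shared atom table.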

(It is also not clear to me what "thread-local memory map" means here, as a memory mapping is by design process-wide: all threads share a single address space in which all memory maps are contained.)

By that I mean that every VM thread will have its own local MmapMut object that will never cross threads but that contains the atom table Mmap* as a sub-map.

And what kind of synchronization exists between them, as multiple threads writing into the same mapping without synchronization is likely to imply a data race.

Absolutely, I plan to use RCU to manage that. But that's outside the scope of the memmap2 library.

Also, I'm aware that memory mapping is process-wide and not per-thread; I mention threads just to give context to my use case. I don't expect memmap2 to be aware of threading and synchronization.

So this appears to be a duplicate of #35, i.e. support for MAP_FIXED.

To be honest, using MAP_FIXED safely appears even harder than plain non-overlapping mappings, and it does not appear to be supported on Windows, so I am unsure whether this crate would be particularly helpful here compared to lower-level wrappers like libc, nix or rustix.
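To illustrate the safety problem: MAP_FIXED silently discards whatever was already mapped in the requested range, so a safe wrapper cannot simply expose the flag. A sketch of the failure mode on Linux, using anonymous mappings purely for demonstration:

fn main() {
    unsafe {
        let len = 4096;

        // Some existing mapping the caller may know nothing about.
        let existing = libc::mmap(
            std::ptr::null_mut(),
            len,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
            -1,
            0,
        );
        assert_ne!(existing, libc::MAP_FAILED);

        // A MAP_FIXED request at the same address succeeds and silently
        // replaces `existing` instead of returning an error.
        let clobbered = libc::mmap(
            existing,
            len,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS | libc::MAP_FIXED,
            -1,
            0,
        );
        assert_eq!(clobbered, existing);
    }
}

(Linux 4.17 added MAP_FIXED_NOREPLACE, which fails with EEXIST instead of clobbering, but that does not help the cross-platform story.)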

Finally, though this is beside the question of this crate's API, why could the terms not just point into a global atom table wherever it is allocated, i.e. why must the two sections of the heap be contiguous?

Heap cells are tagged 64-bit words, and the addresses they carry are relative offsets rather than absolute pointers. If the atom table lived elsewhere, further tags would be needed to distinguish which region an offset refers to, which would complicate the VM design.
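Purely for illustration (this is not the actual cell layout of my VM), here's a hypothetical encoding that shows why a single contiguous base is convenient:

// Hypothetical tagged-cell encoding: a small type tag in the low bits and a
// relative offset in the remaining bits, resolved against one base pointer.
const TAG_BITS: u64 = 3;
const TAG_MASK: u64 = (1 << TAG_BITS) - 1;

#[derive(Clone, Copy)]
struct Cell(u64);

impl Cell {
    fn new(offset: u64, tag: u64) -> Self {
        Cell((offset << TAG_BITS) | (tag & TAG_MASK))
    }

    fn tag(self) -> u64 {
        self.0 & TAG_MASK
    }

    // With the atom table mapped contiguously in front of the heap, one base
    // pointer resolves every offset. If the atom table lived elsewhere, an
    // extra tag (or a branch on the offset) would be needed to pick the
    // right base.
    unsafe fn resolve(self, base: *const u8) -> *const u8 {
        base.add((self.0 >> TAG_BITS) as usize)
    }
}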

If PRs are welcome for this, I'll try to implement MAP_FIXED. I need it to work on Windows too, so I'll be looking at that as well.

You are right, mapping memory at a fixed address is not possible on Windows. Closing this issue.