RazrFalcon/memmap2-rs

memmap2 `advise` is unsound

Closed this issue · 8 comments

I think this only applies to the read-only version, but advise can conceptually performs a write and should require mutable access. Here's a simple reproducer.

use memmap2::*;

fn main() {
    let mut a = MmapMut::map_anon(4096).unwrap();
    a.as_mut().fill(255);
    let a = a.make_read_only().unwrap();

    bar(&a, a.as_ref());
}

fn bar(map: &Mmap, slic: &[u8]) {
    let a = slic[0];
    map.advise(Advice::DontNeed);
    let b = slic[0];

    println!("{} {}", a, b);
}

In debug, this prints out 255 0, as DontNeed has freed the pages. In release, this prints out 255 255. Needlessly to say, this program has UB because it changes the memory an immutable reference points to while it is active.

Great find!

If I understand this correctly, then changing Mmap*::advise(_range) to take &mut self instead of &self should fix this as it would prevent any borrows of the data itself via Deref?

Finally, I wonder whether we want to split the methods/arguments for those which do require exclusive access due to this issue and those which definitely do not. That's probably quite complicated, but it would avoid losing functionality could be made available soundly via shared access.

Ah one additionally problem I see is that the effect of calling "destructive" madvise variants is not necessarily immediate, e.g. from MADV_FREE:

The kernel can thus free these pages, but the freeing could be delayed until memory pressure occurs.

So the above could still happen while a borrow is held (after the borrow used to call advise is already released) and from the point of view of the application, this is somewhat non-deterministic. Not sure if we can do anything about that besides reducing this to an unsafe API...

I think this only applies to the read-only version

By the way, I don't think this is the case: The data backing an exclusive reference (i.e. having noalias semanitcs) changing behind the compiler's back (i.e. not through that reference or those derived from it) is just as bad.

So from a first pass, the following flags make observable changes to the mapped pages: MADV_DONTNEED, MADV_FREE, MADV_REMOVE, MADV_FREE_REUSABLE and MADV_FREE_REUSE.

I am somewhat unsure on how to handle MADV_DONTFORK which means accessing those pages in the child trigger a segmentation fault. Is this necessarily unsafe?

I think we should simply mark advice as unsafe. This is the simplest solution.
Someone can try writing a safe API on top of it, wrapping raw MADV_* flags and mapping their behavior to Rust's type system. But it seems like a lot of work.

Yeah, I think just marking advise as unsafe is the best way of handling it, unless one wants to expose some subset for safe use.

Marking the whole function as unsafe feels like a cop-out as most of the flags do not have the issue.

How about not using an enum for the flags but rather constructor methods returning opaque instances of e.g. struct Advice. The constructors for the not-completely-obviously-safe variants become unsafe functions but the method itself stays safe.

This is more involved than just marking the method as unsafe, but it is also less complicated than having separate safe and unsafe API.

I would be willing to do the leg work of implementing it. (And I think we need a breaking release either way.)

@adamreichold If you're interested in working on it - sure. I can only mark it as unsafe.

I have no idea how advice works. By reading the docs it's absurdly convoluted and asynchronous. Not sure if there is a way to make a decent API around it.
For example, something like MADV_FREE could be implemented by taking self, just like drop. But you technically still can write to the memory and the new data will be preserved. I have no idea how to emulate this via Rust type system.