RazrFalcon/memmap2-rs

Alignment issue on Windows with empty file

Closed this issue ยท 12 comments

The Windows implementation has a problem when trying to create a mmap for an empty file.

Normally, Windows creates a mmap that is aligned in memory. But there's a catch. An empty file gets a pointer to address 1 (0x00000001), which is not aligned.

In Qdrant this was problematic, because we always expected an aligned mmap. We therefore handle this edge case, returning a dangling pointer that is aligned. This is fine because there's no data at the pointer, the byte slice is empty.

Can I submit a PR to implement a similar fix; return an aligned pointer on Windows? Or is the current behavior desirable, and should I not expect to get an aligned pointer?


I'm unsure why this happens on Windows, and couldn't find anything about this in the Windows API documentation. I assume it is an error code, where returning a null pointer is not an option so it picks the next address. This is very easily reproducible however.

This only happens on Windows. Unix-like platforms don't have this problem and return an aligned pointer.

use memmap2::Mmap;
use std::fs::File;

fn main() {
    let file = File::options()
        .read(true)
        .write(true)
        .create(true)
        .open("./empty-file.txt")
        .unwrap();
    file.set_len(0).unwrap();

    let mmap = unsafe { Mmap::map(&file).unwrap() };

    println!("Mmap pointer: {:?}", mmap.as_ptr());
}
# Windows
$ cargo run
Mmap pointer: 0x000000000001

# Linux, macOS
$ cargo run
Mmap pointer: 0x7f51653c1000

Related: qdrant/qdrant#1873
CC: @Jesse-Bakker

Can I submit a PR to implement a similar fix; return an aligned pointer on Windows? Or is the current behavior desirable, and should I not expect to get an aligned pointer?

Please do and include the above test case.

Please also have a look at windows::MmapInner::new which already seems to try to handle this situation but apparently does not do it correctly. I suspect it should check len instead of aligned_len?

This was originally introduced in 1183449 and I think the check was indeed incorrect from the beginning.

I think you're right. Good to see documentation linked there.

I'll submit a PR.

On second look, it is not the Windows API that returns (0x00000001), but the empty_slice_ptr() function:

fn empty_slice_ptr() -> *mut c_void {
    std::ptr::NonNull::<u8>::dangling().cast().as_ptr()
}

u8 is a single byte, the first aligned address after null is (0x00000001).

I do believe the empty file check is correct.

Technically, the returned pointer is aligned, because the Mmap deals with slices of bytes, which have alignment 1. However, since it's an empty slice, you might as well use a pointer with a larger alignment, preferably matching the alignment of a non-empty mmap.

I do believe the empty file check is correct.

Indeed, you are right and I was confused. So we make empty_slice_ptr return something with allocation_granularity as its alignment?

What about something like this (untested)?

fn empty_slice_ptr() -> *mut c_void {
    let align = allocation_granularity().max(1);
    let ptr = unsafe { std::ptr::null_mut::<u8>().add(align) };
    ptr.cast()
}
allocation_granularity.max(1) as *mut c_void

should do it

should do it (provided allocation_granularity is not 0)

I'm unsure if that is guaranteed.

I think for future compatibility with strict provenance we should use transmute instead of as *mut c_void, c.f. https://doc.rust-lang.org/stable/src/core/ptr/mod.rs.html#593

This then?

let align = allocation_granularity().max(1);
unsafe { std::mem::transmute(align) }

Thanks a lot for being so responsive allowing to resolve this so quickly! ๐Ÿ™