RazrFalcon/memmap2-rs

Using MmapMut from multiple threads for simultaneous writing

nyurik opened this issue · 4 comments

I posted a stackoverflow question -- looking for some help on how to use this library to write to a file from multiple threads without any synchronization. Re-posting it here for visibility in case the author (thx!) or someone from the wonderful community can help.

I need to create a 40+ GB file using multiple threads. The file is used as a giant vector of u64 values. Threads do not need any kind of synchronization -- each thread's output will be unique to that thread, but each thread does NOT get its own slice. Rather, the nature of the data ensures each thread will generate a set of unique positions in the file to write to. Simple example -- each thread writes to a position [ind / thread_count], where ind goes to millions. For thread_count = 2, one thread writes to odd positions, and the other to even.

> each thread's output will be unique to that thread, but each thread does NOT get its own slice.

If you can guarantee this within your application, you should be able to start from a single MmapRaw and hand out a raw pointer which is then dereferenced by each thread using unsafe code. I don't think that a pre-existing abstraction will capture the specific use case.

But essentially, if you start from a single MmapRaw or MmapMut, the synchronization task is exactly the same as if you start from a single &mut [u8] which is accessed by the threads in the manner you describe. (Basically, your application would not care whether it writes to a memory buffer backed by anonymous memory handed out by the heap allocator or to one backed by a file and obtained directly using mmap.)
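To illustrate the point about the backing memory not mattering, here is a minimal std-only sketch of the interleaved-write pattern from the question. It uses a plain `&mut [u64]` as a stand-in for the mapping; with memmap2 the pointer would presumably come from the map's `as_mut_ptr()` cast to `*mut u64`, which is an assumption not shown here. `SharedPtr` and `fill_interleaved` are hypothetical names made up for this example, and the whole thing is only sound because each index is written by exactly one thread.

```rust
use std::thread;

/// Hypothetical wrapper so a raw pointer can be moved into spawned threads.
/// Not part of memmap2 or std.
#[derive(Clone, Copy)]
struct SharedPtr(*mut u64);

// SAFETY: sending the pointer across threads is sound only because the
// caller guarantees that no two threads ever touch the same index.
unsafe impl Send for SharedPtr {}

impl SharedPtr {
    /// Returns the wrapped pointer. Going through a method makes closures
    /// capture the whole wrapper rather than the non-Send raw pointer field.
    fn get(self) -> *mut u64 {
        self.0
    }
}

/// Sets `buf[i] = i * 10` using `thread_count` threads, where thread `t`
/// writes only the indices with `i % thread_count == t`.
fn fill_interleaved(buf: &mut [u64], thread_count: usize) {
    let len = buf.len();
    let ptr = SharedPtr(buf.as_mut_ptr());
    // Scoped threads are all joined before `scope` returns, so the pointer
    // never outlives the borrow of `buf`.
    thread::scope(|s| {
        for t in 0..thread_count {
            s.spawn(move || {
                for i in (t..len).step_by(thread_count) {
                    // SAFETY: `i < len`, and index `i` is written by thread
                    // `t` only, so writes never overlap between threads.
                    unsafe { ptr.get().add(i).write(i as u64 * 10) };
                }
            });
        }
    });
}
```

Calling `fill_interleaved(&mut buf, 2)` has one thread write the even positions and the other the odd ones. Nothing in the function depends on where `buf` lives, which is the point made above.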

As a further illustration of the above, even though it might not be helpful for your use case: since the byte slice provided by MmapMut is just a byte slice, it can be used with e.g. Rayon's ParallelSliceMut and slice::IterMut to operate on it in parallel.
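For the case where each thread can own a contiguous chunk, a rough std-only analogue of the chunked approach looks like the sketch below. It uses `chunks_mut` with scoped threads instead of Rayon (Rayon's `par_chunks_mut` would take the place of the manual spawning), and `fill_chunked` is a hypothetical name. For simplicity it spawns one thread per chunk, which a real program would want to bound.

```rust
use std::thread;

/// Treats `bytes` as an array of little-endian u64 values and sets value
/// number `i` to `i`, processing `values_per_chunk` values per thread.
fn fill_chunked(bytes: &mut [u8], values_per_chunk: usize) {
    assert!(values_per_chunk > 0, "chunk size must be non-zero");
    assert_eq!(bytes.len() % 8, 0, "length must be a multiple of 8");
    thread::scope(|s| {
        // `chunks_mut` hands out non-overlapping `&mut [u8]` subslices, so
        // no unsafe code is needed to share the buffer between threads.
        for (chunk_idx, chunk) in bytes.chunks_mut(values_per_chunk * 8).enumerate() {
            s.spawn(move || {
                let first = chunk_idx * values_per_chunk;
                for (j, slot) in chunk.chunks_exact_mut(8).enumerate() {
                    let value = (first + j) as u64;
                    slot.copy_from_slice(&value.to_le_bytes());
                }
            });
        }
    });
}
```

Because the borrow checker can see that the chunks are disjoint, this version stays entirely in safe Rust. It does not fit the interleaved pattern from the question, which is why raw pointers come up there.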

@adamreichold thank you for your quick reply!!! I'm still new to Rust (fun to learn it though!), so I might need a bit more guidance. Could you show a simple example (maybe even later add it to the docs?) that demonstrates how I can do this? Ideally this method should be usable from the rayon::iter::ParallelBridge -- par_map_reduce(). For a while I was considering opening the same file once from each thread with a localstore, but then it becomes a nightmare to manage them when used from a threadpool: each file has to be properly closed, which is not simple when the pool is managed by rayon. Thanks!!!

P.S. A user has offered a somewhat different example -- but I cannot evaluate whether that approach has merit or not (see code)

The technique shown in https://stackoverflow.com/questions/33818141/how-do-i-pass-disjoint-slices-from-a-vector-to-different-threads is basically what I am suggesting. As written above, your code should start off with &mut [u8] and not care how that reference was produced.

Also as written above, you will probably need to pass *mut u8 and have each thread produce non-overlapping exclusive references via

fn some_thread(ptr: *mut u8, pos: usize) {
    // SAFETY: Only one thread will access the data at `pos`, so `val`
    // will not alias any other references.
    let val = unsafe { &mut *ptr.add(pos) };
    ...
}

if you cannot use an existing chunking mechanism like par_chunks or par_iter.

(I don't think AtomicU8 enters the picture if your threads never touch overlapping parts of the initial slice, just as it does not enter if you call [u8]::split_at_mut and pass the resulting subslices to two different threads.)
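The split-based comparison can be sketched with std only. In the example below, two scoped threads each receive one half of a slice from `split_at_mut` and write to it with plain stores, with no atomics or locks involved. `fill_halves` is a hypothetical name used just for this illustration.

```rust
use std::thread;

/// Fills the first half of `data` with 1 and the second half with 2,
/// using one thread per half.
fn fill_halves(data: &mut [u8]) {
    let mid = data.len() / 2;
    // The two subslices are guaranteed not to overlap, so each thread
    // holds an exclusive reference to its own part.
    let (left, right) = data.split_at_mut(mid);
    thread::scope(|s| {
        s.spawn(move || left.fill(1));
        s.spawn(move || right.fill(2));
    });
}
```

The raw-pointer variant relies on the same reasoning: exclusivity per byte (or per u64) is what matters. The difference is that there the programmer upholds the guarantee by hand, while here the type system enforces it.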