do not rely on `VolatileCell` for memory safety or to prevent reordering of memory operations
japaric opened this issue · 3 comments
The implementation uses volatile operations to prevent reordering of memory operations, but as used this is wrong for two different reasons:
- volatile operations are not reordered (by the compiler) with respect to other volatile operations, but the compiler is free to move non-volatile operations around them
For example, `RttHeader::init` contains this code:

```rust
ptr::write_volatile(&mut self.max_down_channels, max_down_channels);
ptr::copy_nonoverlapping(b"SEGG_".as_ptr(), self.id.as_mut_ptr(), 5);
// ..
```
The second call, `ptr::copy_nonoverlapping`, is a non-volatile operation, so the compiler is free to move it before the first, volatile `ptr::write_volatile` call. Whether you observe that reordering today or not is an implementation detail; any LLVM upgrade could cause these two operations to be ordered differently.
The correct abstraction to prevent both compiler-level and architecture-level reordering is an atomic fence: either use `atomic::fence` directly, or use the `Atomic*` types' `load`/`store`/etc. operations -- the latter contain implicit atomic fences. How the different `Ordering`s prevent reordering is covered here.
- volatile operations prevent compiler reordering, but the memory operations can still be reordered at the architecture level: the processor may execute instructions out of order, or the memory bus may commit writes in an order that doesn't match the source code; cache implementations may also be relevant here.

Again, the correct way to prevent reordering of memory operations is atomic fences. Using atomic fences will make this crate portable to e.g. (single-core) ARM Cortex-A and more complex architectures.
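To make the suggestion concrete, header initialization along those lines could be sketched roughly as below. The `Header` layout, field names, and the 5-byte `b"SEGG_"` ID are placeholders modeled on the snippet above, not the crate's actual definitions:

```rust
use std::ptr;
use std::sync::atomic::{fence, Ordering};

#[repr(C)]
struct Header {
    id: [u8; 16],
    max_up_channels: u32,
    max_down_channels: u32,
}

impl Header {
    fn init(&mut self, max_up_channels: u32, max_down_channels: u32) {
        unsafe {
            // Volatile writes keep the compiler from eliding the stores,
            // but on their own they do not order the stores against the
            // non-volatile copy below.
            ptr::write_volatile(&mut self.max_up_channels, max_up_channels);
            ptr::write_volatile(&mut self.max_down_channels, max_down_channels);
        }
        // The fence orders everything above before the ID write, at both the
        // compiler and the architecture level, so an external observer (the
        // debugger) never sees a valid ID next to stale channel counts.
        fence(Ordering::SeqCst);
        unsafe {
            ptr::copy_nonoverlapping(b"SEGG_".as_ptr(), self.id.as_mut_ptr(), 5);
        }
    }
}

fn main() {
    let mut h = Header { id: [0; 16], max_up_channels: 0, max_down_channels: 0 };
    h.init(2, 1);
    assert_eq!(&h.id[..5], b"SEGG_");
}
```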
cc @Yatekii
Thanks a lot for reporting this too!
I didn't even know that the core might reorder the memory operations! Thanks for telling me.
I suppose an acceptable course of action would be to get rid of the volatiles and then just pepper `fence(SeqCst)` in some critical spots (e.g. when writing the header, and also when dealing with the buffer data) to enforce correct ordering between the core and the debugger (which is a somewhat weird situation to be dealing with anyway). I doubt it would have any sort of performance impact in practice either.
Edit:
On second thought, fences don't prevent the compiler from removing operations entirely when the results appear unused (which, here, they do), do they? So all the writes still have to be volatile.
I don't suppose there's a volatile "memcpy" anywhere? I really would rather not implement an efficient memcpy by hand.
Does this commit look any better? I used atomics for everything that changes on the fly, and volatile writes for all the initialization stuff.
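That split (atomics for state that changes on the fly, volatile writes for one-time initialization) can be illustrated with a toy up channel. The struct, names, and orderings here are illustrative and not taken from the commit:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Toy ring-buffer up channel. The debugger polls `write` from outside,
// so it is atomic: the Release store guarantees the buffer bytes are
// visible before the advanced write offset is.
struct UpChannel {
    buf: [u8; 16],
    write: AtomicUsize,
}

impl UpChannel {
    fn push(&mut self, data: &[u8]) {
        let w = self.write.load(Ordering::Relaxed);
        for (i, &b) in data.iter().enumerate() {
            self.buf[(w + i) % 16] = b;
        }
        // Publish the new offset only after the data is in place.
        self.write.store((w + data.len()) % 16, Ordering::Release);
    }
}

fn main() {
    let mut ch = UpChannel { buf: [0; 16], write: AtomicUsize::new(0) };
    ch.push(b"rtt");
    assert_eq!(ch.write.load(Ordering::Acquire), 3);
    assert_eq!(&ch.buf[..3], b"rtt");
}
```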