BTreeSet causes UB by having (partially) dangling shared reference
RalfJung opened this issue · 25 comments
The following code causes UB (found by a patched miri enforcing validity invariants):
let mut b = std::collections::BTreeSet::new();
b.insert(42);
What happens is that BTreeSet::new
sets the root node to a pointer to a static EMPTY_ROOT_NODE
shared for this purpose by all B-trees that has type LeafNode<(), ()>
. That step is okay.
However, in the next step, insert
calls ensure_root_is_owned
which indirectly turns this ptr into &LeafNode<i32, ()>
. The trouble is that LeafNode<i32, ()>
is bigger than LeafNode<(), ()>
, so this reference is partially dangling: Its beginning points to allocated memory (the static EMPTY_ROOT_NODE
), but that allocated memory ends "too early". The invariant that &LeafNode<i32, ()>
be dereferencable for mem::size_of::<LeafNode<i32, ()>>()
bytes is violated. Since we are passing dereferencable(n)
to LLVM with n
set to the full size of the struct, this means we have UB not just according to our own validity rules, but according to LLVM's rules.
Good catch, and very interesting. I wasn't aware that this was a thing. When was this semantic decided/documented?
This can be easily fixed by hoisting the ptr::eq(self, &EMPTY_ROOT_NODE as *const _ as *const _)
into is_shared_root
.
Also: what value does that metadata have? Allowing speculative loads?
This can be easily fixed by hoisting the ptr::eq(self, &EMPTY_ROOT_NODE as *const _ as *const _) into is_shared_root.
I don't think that's enough. All the code that does read-only operations (get
, for example) will likely also turn the shared root into a reference.
what value does that metadata have? Allowing speculative loads?
Exactly.
Moreover, even just enforcing the noalias
annotations the way I proposed makes it UB to have dangling references.
@nikomatsakis points out there might also be alignment problems. @eddyb are there any types that have alignment bigger than pointers?
One possible fix -- but it'd be a big lang change -- is supporting generic statics (so that the static node can have a valid type). Of course that's not a "short term" usable thing.
Another thing @nikomatsakis suggested is to essentially use &LeafNode<(), ()>
throughout the read-only operations until we determined that the node actually is non-empty. That should also work.
I need to go over the code to check but I assume the “right” fix is to define an untyped base header type that LeafNode “inherits” from, just like we did in the Swift stdlib (where swift has TBAA)
Deferring working on this for a bit while some other stuff lands, some notes:
the core of the right fix is to add something like this code:
fn as_key_slice(&self) -> &'a [K] {
// When taking a pointer to the keys, if our key has a stricter
// alignment requirement than the shared root does, then the pointer
// would be out of bounds, which LLVM assumes will not happen. If the
// alignment is more strict, we need to make an empty slice that doesn't
// use an out of bounds pointer.
if mem::align_of::<K>() > mem::align_of::<LeafPrefix>() && self.is_shared_root() {
&[]
} else {
// Here either it's not the root, or the alignment is less strict,
// in which case the keys pointer will point "one-past-the-end" of
// the node, which is allowed by LLVM.
unsafe {
slice::from_raw_parts(
&(*self.node as *const LeafNode<K, V>).keys as *const K,
self.len()
)
}
}
}
to btree::node::Root
, and introduce an untyped header/prefix as I discussed above. The annoying part is going through all the code and making perfectly sure nothing casts to a Leaf prematurely.
the core of the right fix is to add something like this code:
I am not sure why comparing the alignments is enough to guarantee that things are in-bounds (I assume "the pointer" which is out of bounds is self.node
, but that's not clear either). And anyway wouldn't it be much simpler to just return &[]
whenever self.len() == 0
? Determining the len
should be possible even for the "untyped header".
The idea is we're trying to avoid any additional checks.
Alignment differences determine padding, which in turn determines if the keys would be placed within the bounds of the LeafPrefix's size.
Oh I see, the alignment checks would only involve constants and hence not be present at run-time... self.is_shared_root()
would still be a run-time check though?
Yeah it compares a dynamic pointer to a static value, but it's &&'d, so under reasonable conditions it will never run.
Is there a way to observe some kind of memory problem in practice because of this issue, such as an out-of-bounds read?
I'm asking because it would be an interesting test for https://github.com/blt/bughunt-rust - it would be very useful to see whether the testing setup detects this issue or not in order to gauge how it performs in practice.
No, I don't think LLVM exploits this UB and it doesn't by itself lead to bad memory accesses or such things.
are there any types that have alignment bigger than pointers?
I'm not sure I follow whether this is relevant, but this is the case for:
repr(align(N))
types whereN
is larger thanalign_of::<*const T>()
: for example, aContext
type containing the state of all registers can be created efficiently using something like_xsave
, but that requiresContext
to be 64-byte aligned,repr(simd)
types: in general, these have alignment requirements much larger than pointers, for example, up to 32-byte alignment on x86.
@gankro I have a fix for the out-of-bounds problem... but there's another one. ;) Consider into_slices_mut
:
fn into_val_slice_mut(mut self) -> &'a mut [V] {
debug_assert!(!self.is_shared_root());
unsafe {
slice::from_raw_parts_mut(
self.as_leaf_mut().vals.as_mut_ptr() as *mut V,
self.len()
)
}
}
fn into_slices_mut(self) -> (&'a mut [K], &'a mut [V]) {
let k = unsafe { ptr::read(&self) };
(k.into_key_slice_mut(), self.into_val_slice_mut())
}
I find the ptr::read
quite fascinating, this is duplicating a non-Copy
value. :D
But anyway, this is a blatant violation of mutable references being unique. Namely, when we call into_val_slice_mut
, we call as_leaf_mut()
, meaning we assert unique access to the entire leaf -- including the keys! Of course, this invalidates the mutable reference returned from into_key_slice_mut
.
Cases like this and the discussions we have been having around two-phase borrows make me wonder whether it is indeed the right call to consider creating a mutable reference as being roughly equivalent to a write access. The trouble is, if you don't do that, then when do you even assert uniqueness of mutable references? Only when they are written to? That can't be it, we couldn't say noalias
to LLVM with that. Passing a mutable reference to a function must also be something like a write. But then it seems really odd to special-case function calls like that, considering in particular that we want to solve #16515 some day.
The only reason we create a mutable reference here is because we lack ->
. The code is just trying to compute some pointer offsets as easily as possible :/
Ah right, using raw pointers likely fixes the problem.
@nikomatsakis @eddyb lack of auto-deref for raw pointers considered harmful...
I'd rather change the deref operator from prefix to postfix than make it implicit for raw pointers.
Yeah, a postfix deref operator would be nice. Currently, things get a bit ugly.
Couldn't we add some kind of method (inherently postfix) as an 80-20 solution? Or is the problem that we couldn't give it an appropriate type?
I don't think *
is an operation that a function can perform. *
does not mean memory actually gets accessed, as in &*
!
I don't think postfix *
works because ptr*.0
is ambiguous as either "mul by 0.0" or "deref to get field .0", would likely need ->
or something new.