BTreeSet causes UB by having (partially) dangling shared reference

Question

BTreeSet causes UB by having (partially) dangling shared reference

RalfJung opened this issue 6 years ago · 25 comments

The following code causes UB (found by a patched miri enforcing validity invariants):

    let mut b = std::collections::BTreeSet::new();
    b.insert(42);

What happens is that BTreeSet::new sets the root node to a pointer to a static EMPTY_ROOT_NODE shared for this purpose by all B-trees that has type LeafNode<(), ()>. That step is okay.

However, in the next step, insert calls ensure_root_is_owned which indirectly turns this ptr into &LeafNode<i32, ()>. The trouble is that LeafNode<i32, ()> is bigger than LeafNode<(), ()>, so this reference is partially dangling: Its beginning points to allocated memory (the static EMPTY_ROOT_NODE), but that allocated memory ends "too early". The invariant that &LeafNode<i32, ()> be dereferencable for mem::size_of::<LeafNode<i32, ()>>() bytes is violated. Since we are passing dereferencable(n) to LLVM with n set to the full size of the struct, this means we have UB not just according to our own validity rules, but according to LLVM's rules.

Answer 1 · 2018-10-10T16:15:34.000Z

cc @gankro

Answer 2 · 2018-10-10T21:56:04.000Z

Good catch, and very interesting. I wasn't aware that this was a thing. When was this semantic decided/documented?

This can be easily fixed by hoisting the ptr::eq(self, &EMPTY_ROOT_NODE as *const _ as *const _) into is_shared_root.

Answer 3 · 2018-10-10T21:57:15.000Z

Also: what value does that metadata have? Allowing speculative loads?

Answer 4 · 2018-10-11T07:26:16.000Z

This can be easily fixed by hoisting the ptr::eq(self, &EMPTY_ROOT_NODE as *const _ as *const _) into is_shared_root.

I don't think that's enough. All the code that does read-only operations (get, for example) will likely also turn the shared root into a reference.

what value does that metadata have? Allowing speculative loads?

Exactly.

Moreover, even just enforcing the noalias annotations the way I proposed makes it UB to have dangling references.

Answer 5 · 2018-10-11T13:50:23.000Z

@nikomatsakis points out there might also be alignment problems. @eddyb are there any types that have alignment bigger than pointers?

Answer 6 · 2018-10-11T13:57:06.000Z

One possible fix -- but it'd be a big lang change -- is supporting generic statics (so that the static node can have a valid type). Of course that's not a "short term" usable thing.

Answer 7 · 2018-10-11T14:01:15.000Z

Another thing @nikomatsakis suggested is to essentially use &LeafNode<(), ()> throughout the read-only operations until we determined that the node actually is non-empty. That should also work.

Answer 8 · 2018-10-11T14:23:51.000Z

I need to go over the code to check but I assume the “right” fix is to define an untyped base header type that LeafNode “inherits” from, just like we did in the Swift stdlib (where swift has TBAA)

Answer 9 · 2018-10-12T14:17:23.000Z

Deferring working on this for a bit while some other stuff lands, some notes:

the core of the right fix is to add something like this code:

    fn as_key_slice(&self) -> &'a [K] {
        // When taking a pointer to the keys, if our key has a stricter
        // alignment requirement than the shared root does, then the pointer
        // would be out of bounds, which LLVM assumes will not happen. If the
        // alignment is more strict, we need to make an empty slice that doesn't
        // use an out of bounds pointer.
        if mem::align_of::<K>() > mem::align_of::<LeafPrefix>() && self.is_shared_root() {
            &[]
        } else {
            // Here either it's not the root, or the alignment is less strict,
            // in which case the keys pointer will point "one-past-the-end" of
            // the node, which is allowed by LLVM.
            unsafe {
                slice::from_raw_parts(
                    &(*self.node as *const LeafNode<K, V>).keys as *const K,
                    self.len()
                )
            }
        }
    }

to btree::node::Root, and introduce an untyped header/prefix as I discussed above. The annoying part is going through all the code and making perfectly sure nothing casts to a Leaf prematurely.

Answer 10 · 2018-10-13T07:31:18.000Z

the core of the right fix is to add something like this code:

I am not sure why comparing the alignments is enough to guarantee that things are in-bounds (I assume "the pointer" which is out of bounds is self.node, but that's not clear either). And anyway wouldn't it be much simpler to just return &[] whenever self.len() == 0? Determining the len should be possible even for the "untyped header".

Answer 11 · 2018-10-14T12:17:00.000Z

The idea is we're trying to avoid any additional checks.

Alignment differences determine padding, which in turn determines if the keys would be placed within the bounds of the LeafPrefix's size.

Answer 12 · 2018-10-14T15:19:17.000Z

Oh I see, the alignment checks would only involve constants and hence not be present at run-time... self.is_shared_root() would still be a run-time check though?

Answer 13 · 2018-10-14T17:06:05.000Z

Yeah it compares a dynamic pointer to a static value, but it's &&'d, so under reasonable conditions it will never run.

Answer 14 · 2018-10-24T18:17:14.000Z

Is there a way to observe some kind of memory problem in practice because of this issue, such as an out-of-bounds read?

I'm asking because it would be an interesting test for https://github.com/blt/bughunt-rust - it would be very useful to see whether the testing setup detects this issue or not in order to gauge how it performs in practice.

Answer 15 · 2018-10-24T18:22:28.000Z

No, I don't think LLVM exploits this UB and it doesn't by itself lead to bad memory accesses or such things.

Answer 16 · 2018-11-26T16:17:03.000Z

are there any types that have alignment bigger than pointers?

I'm not sure I follow whether this is relevant, but this is the case for:

repr(align(N)) types where N is larger than align_of::<*const T>(): for example, a Context type containing the state of all registers can be created efficiently using something like _xsave, but that requires Context to be 64-byte aligned,
repr(simd) types: in general, these have alignment requirements much larger than pointers, for example, up to 32-byte alignment on x86.

Answer 17 · 2018-12-07T22:35:58.000Z

@gankro I have a fix for the out-of-bounds problem... but there's another one. ;) Consider into_slices_mut:

    fn into_val_slice_mut(mut self) -> &'a mut [V] {
        debug_assert!(!self.is_shared_root());
        unsafe {
            slice::from_raw_parts_mut(
                self.as_leaf_mut().vals.as_mut_ptr() as *mut V,
                self.len()
            )
        }
    }

    fn into_slices_mut(self) -> (&'a mut [K], &'a mut [V]) {
        let k = unsafe { ptr::read(&self) };
        (k.into_key_slice_mut(), self.into_val_slice_mut())
    }

I find the ptr::read quite fascinating, this is duplicating a non-Copy value. :D

But anyway, this is a blatant violation of mutable references being unique. Namely, when we call into_val_slice_mut, we call as_leaf_mut(), meaning we assert unique access to the entire leaf -- including the keys! Of course, this invalidates the mutable reference returned from into_key_slice_mut.

Cases like this and the discussions we have been having around two-phase borrows make me wonder whether it is indeed the right call to consider creating a mutable reference as being roughly equivalent to a write access. The trouble is, if you don't do that, then when do you even assert uniqueness of mutable references? Only when they are written to? That can't be it, we couldn't say noalias to LLVM with that. Passing a mutable reference to a function must also be something like a write. But then it seems really odd to special-case function calls like that, considering in particular that we want to solve #16515 some day.

Answer 18 · 2018-12-07T23:04:59.000Z

The only reason we create a mutable reference here is because we lack ->. The code is just trying to compute some pointer offsets as easily as possible :/

Answer 19 · 2018-12-08T22:30:30.000Z

Ah right, using raw pointers likely fixes the problem.

@nikomatsakis @eddyb lack of auto-deref for raw pointers considered harmful...

Answer 20 · 2018-12-09T08:29:52.000Z

I'd rather change the deref operator from prefix to postfix than make it implicit for raw pointers.

Answer 21 · 2018-12-09T13:01:50.000Z

Yeah, a postfix deref operator would be nice. Currently, things get a bit ugly.

Answer 22 · 2018-12-09T13:53:41.000Z

Couldn't we add some kind of method (inherently postfix) as an 80-20 solution? Or is the problem that we couldn't give it an appropriate type?

Answer 23 · 2018-12-09T16:12:10.000Z

I don't think * is an operation that a function can perform. * does not mean memory actually gets accessed, as in &*!

Answer 24 · 2018-12-09T16:35:14.000Z

I don't think postfix * works because ptr*.0 is ambiguous as either "mul by 0.0" or "deref to get field .0", would likely need -> or something new.

Answer 25 · 2018-12-09T18:10:19.000Z

I submitted a proposed fix for this bug at #56648