VecDeque's Drain::drop writes to memory that a shared reference points to
RalfJung opened this issue · 21 comments
liballoc's test_drain fails when run in Miri. The error occurs in the drain(...).collect() call:
collectis called with avec_deque::Drain<usize>as argument.DraincontainsItercontains a shared reference to a slice; that slice is thus marked as "must not be mutated for the entire duration of this function call".collectcallsfrom_itercallsextendcallsfor_eachcallsfold, which eventually drops theDrain.Drain::dropcallssource_deque.wrap_copyto re-arrange stuff (I have not fully understood this yet), and in some cases this will end up writing to memory that the slice inIterpoints to.
I am not sure what the best way to fix this is. We have to fix Drain holding (indirectly) a shared reference to something that it'll mutate during its Drop. The mutation is likely there for a reason, so I guess the shared reference has to go. (FWIW, this shared reference already caused other trouble, but that was fixed by #56161.)
yeah I guess just change it to *const [T] and toss in a PhantomData for the borrow. shrug
You mean, change this in Iter as well? Sure, that should work. In principle this reduces the precision of compiler analyses on VecDeque::iter().
ahh misread. maybe would just not use Iter then. depends on what the impl looks like now
Drain::dropcallssource_deque.wrap_copyto re-arrange stuff (I have not fully understood this yet), and in some cases this will end up writing to memory that the slice inIterpoints to.
But the first thing it does is self.for_each(drop); which should exhaust the Iter such that its ring is an empty slice. I suppose its pointer will still be somewhere in the deque's memory, but with length 0 it can't actually read anything. Is that not good enough to appease Miri?
But the first thing it does is self.for_each(drop); which should exhaust the Iter such that its ring is an empty slice. I suppose its pointer will still be somewhere in the deque's memory, but with length 0 it can't actually read anything. Is that not good enough to appease Miri?
No. This code still matches the pattern
fn foo(mut x: (T, &[U])) {
let ptr = &x.1[0] as *const U;
// do some stuff with x that "semantically" "exhausts" the slice.
x.1 = &[];
// then write to where `x.1` used to point to.
*(ptr as *mut U) = ...;
}There is no notion of "exhausting" or "consuming" a shared reference. When you create a shared reference with a certain lifetime, you promise that until that lifetime ends, no mutation happens. x here is mutable so you can change what the slice points to, but that has no effect at all on the guarantees that have been made about the value that was initially passed to foo.
Now, lifetimes are an entirely static concept, so for Stacked Borrows a "lifetime" generally is "until the pointer is used for the last time". However, for the special case of a reference being passed to a function, Stacked Borrows asserts that the "lifetime" will last at least as long as the function the reference is pointed to. (This is enforced by the "barriers" explained in this blog post.) This is extremely useful for optimizations.
ahh misread. maybe would just not use Iter then. depends on what the impl looks like now
Drain::{next, next_back, size_hint} forward to Iter. I wouldn't want to copy those.
Oh, I also didn't realize that Iter doesn't actually shrink down to an empty slice -- it only updates its indexes. It would otherwise have to use two slices for the wrapped head and tail. So that full shared reference does still exist after the iterator is exhausted. (This also raises the validity question of having references to uninitialized data.)
Could we base Drain on IterMut instead? We could arrange wrap_copy to work through that mutable reference too, so there's no question of mutable aliasing.
Could we base Drain on IterMut instead? We could arrange wrap_copy to work through that mutable reference too, so there's no question of mutable aliasing.
That might work. However, Drain::drop must not access deque.buf in that case -- that would be in conflict with the unique reference in IterMut.
Hmm, does Drain need to be covariant over T? I guess IterMut would break that.
Actually this idea of "consuming a shared ref" is far from unique to VecDeque... the following code also errors in Miri:
use std::cell::{RefCell, Ref};
fn break_it(rc: &RefCell<i32>, r: Ref<'_, i32>) {
// `r` has a shared reference, it is passed in as argument and hence
// a barrier is added that marks this memory as read-only for the entire
// duration of this function.
drop(r);
// *oops* here we can mutate that memory.
*rc.borrow_mut() = 2;
}
fn main() {
let rc = RefCell::new(0);
break_it(&rc, rc.borrow())
}As I understand it, Miri and stacked borrows should be "more powerful" than the Rust borrow checker, right?
So, is this an example where Miri does not match what Rust allows or is this a bug in Rust that could lead to undefined behaviour in safe code?
(Your example in the playground)
@Julius-Beides -- I believe @RalfJung is admitting that RefCell example as something that should be allowed. But this issue with VecDeque involves raw pointers, which rustc doesn't check at all, so miri is in uncharted waters here.
So, is this an example where Miri does not match what Rust allows or is this a bug in Rust that could lead to undefined behaviour in safe code?
This is up for us all to decide. I can't tell you. ;)
I believe @RalfJung is admitting that RefCell example as something that should be allowed.
Not necessary. I have argued that Ref/RefMut are problematic long before Stacked Borrows. The lifetimes there are, strictly speaking, wrong. I think it would be reasonable to rule that they have to use raw pointers.
FWIW here is a stand-alone version of the testcase to reproduce this with Miri (but it needs access to some private VecDeque internals):
use std::collections::VecDeque;
fn main() {
let mut tester: VecDeque<usize> = VecDeque::with_capacity(3);
let cap = tester.capacity();
for len in 0..=cap {
for tail in 0..=cap {
for drain_start in 0..=len {
for drain_end in drain_start..=len {
tester.tail = tail;
tester.head = tail;
for i in 0..len {
tester.push_back(i);
}
// Check that we drain the correct values
println!("Testing {} {} {} {}", len, tail, drain_start, drain_end);
let drained: VecDeque<_> = tester.drain(drain_start..drain_end).collect();
let drained_expected: VecDeque<_> = (drain_start..drain_end).collect();
assert_eq!(drained, drained_expected);
// We shouldn't have changed the capacity or made the
// head or tail out of bounds
assert_eq!(tester.capacity(), cap);
assert!(tester.tail < tester.cap());
assert!(tester.head < tester.cap());
// We should see the correct values in the VecDeque
let expected: VecDeque<_> = (0..drain_start)
.chain(drain_end..len)
.collect();
assert_eq!(expected, tester);
}
}
}
}
}See rust-lang/unsafe-code-guidelines#125 for the issue about whether Stacked Borrows should allow such code or not.
For now, with rust-lang/miri#872, Miri accepts this code again. So I am going to close the issue here. It still represents an interesting data point to inform what Stacked Borrows rules should be.
nods wisely
just gotta stand still long enough for the spec to bend around you
@gankro ;-)
I don't think the decision is made either way, but for now with several counterexamples in libstd, it is not helpful for Miri to complain about this. Plus, this matches what we formalized in Coq; the basic optimizations that we considered so far are all fine with this change.
Smaller reproduction that I found in #99701 (a now closed dupe of this issue) that doesn't rely on VecDeque internals, that fails now with -Zmiri-retag-fields
use std::collections::VecDeque;
fn main() {
let mut tester: VecDeque<usize> = VecDeque::with_capacity(3);
for i in 0..3 {
tester.push_back(i);
}
let _: VecDeque<_> = tester.drain(1..2).collect();
}-Zmiri-retag-fields re-enables the part of Stacked Borrows that got disabled (in parts) to work around the issue, before I knew that we actually have noalias here and are risking real miscompilations.