Data race in thread::scope
RalfJung opened this issue
The following code sometimes causes a data race to be reported by Miri:
#![feature(atomic_from_mut, inline_const)]
use std::sync::atomic::{AtomicBool, Ordering};
fn main() {
    let mut some_bools = [const { AtomicBool::new(false) }; 10];
    let view: &mut [bool] = AtomicBool::get_mut_slice(&mut some_bools);
    assert_eq!(view, [false; 10]);
    view[..5].copy_from_slice(&[true; 5]);
    std::thread::scope(|s| {
        for t in &some_bools[..5] {
            s.spawn(move || assert_eq!(t.load(Ordering::Relaxed), true));
        }
        for f in &some_bools[5..] {
            s.spawn(move || assert_eq!(f.load(Ordering::Relaxed), false));
        }
    });
}
Miri reports:
note: tracking was triggered
--> /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/thread/scoped.rs:133:17
|
133 | let scope = Scope {
| _________________^
134 | | data: ScopeData {
135 | | num_running_threads: AtomicUsize::new(0),
136 | | main_thread: current(),
... |
140 | | scope: PhantomData,
141 | | };
| |_____^ created allocation with id 1082
|
= note: inside `std::thread::scope::<[closure@atomic.rs:12:20: 20:2], ()>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/thread/scoped.rs:133:17
note: inside `main` at atomic.rs:12:1
--> atomic.rs:12:1
|
12 | / std::thread::scope(|s| {
13 | | for t in &some_bools[..5] {
14 | | s.spawn(move || assert_eq!(t.load(Ordering::Relaxed), true));
15 | | }
... |
19 | | }
20 | | });
| |__^
error: Undefined Behavior: Data race detected between Deallocate on Thread(id = 0, name = "main") and Read on Thread(id = 2) at alloc1082+0x8 (current vector clock = VClock([114, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]), conflicting timestamp = VClock([113, 0, 9, 9, 9, 0, 0, 0, 0, 0, 9]))
--> atomic.rs:12:1
|
12 | / std::thread::scope(|s| {
13 | | for t in &some_bools[..5] {
14 | | s.spawn(move || assert_eq!(t.load(Ordering::Relaxed), true));
15 | | }
... |
19 | | }
20 | | });
| |__^ Data race detected between Deallocate on Thread(id = 0, name = "main") and Read on Thread(id = 2) at alloc1082+0x8 (current vector clock = VClock([114, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]), conflicting timestamp = VClock([113, 0, 9, 9, 9, 0, 0, 0, 0, 0, 9]))
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `main` at atomic.rs:12:1
The deallocation occurs when std::thread::scope returns and its local variable scope gets deallocated.
The read occurs here (I am fairly sure): rust/library/std/src/thread/scoped.rs, lines 58 to 59 (commit 7098a71).
This reads the contents of the field self.main_thread after the fetch_sub. But the moment the fetch_sub is done, the memory backing self might be deallocated. That makes the read a potential use-after-free, and a race with the deallocation.
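To see the shape of the problem outside of std, here is a minimal self-contained sketch (hypothetical names, not the actual scoped.rs code) of the pattern being described: an atomic counter next to a plain field, where the waiting thread treats "counter reached zero" as permission to free the whole structure.

use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread::Thread;

// Hypothetical stand-in for ScopeData: an atomic counter plus a plain
// (non-atomic, non-UnsafeCell) field, like main_thread in scoped.rs.
struct SharedState {
    num_running_threads: AtomicUsize,
    main_thread: Thread,
}

impl SharedState {
    fn decrement_num_running_threads(&self) {
        // The waiting thread treats "counter reached zero" as permission to
        // return and deallocate the memory behind `self`. Reading
        // `self.main_thread` after the fetch_sub below can therefore be a
        // use-after-free, racing with that deallocation.
        if self.num_running_threads.fetch_sub(1, Ordering::Release) == 1 {
            self.main_thread.unpark();
        }
    }
}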
Cc @m-ou-se
The following patch fixes this:
diff --git a/library/std/src/thread/scoped.rs b/library/std/src/thread/scoped.rs
index a387a09dc8b..06019cbc75b 100644
--- a/library/std/src/thread/scoped.rs
+++ b/library/std/src/thread/scoped.rs
@@ -55,8 +55,10 @@ pub(super) fn decrement_num_running_threads(&self, panic: bool) {
         if panic {
             self.a_thread_panicked.store(true, Ordering::Relaxed);
         }
+        // The moment we do the `fetch_sub`, the memory we ourselves are stored in can be deallocated.
+        let main_thread = self.main_thread.clone();
         if self.num_running_threads.fetch_sub(1, Ordering::Release) == 1 {
-            self.main_thread.unpark();
+            main_thread.unpark();
         }
     }
 }
However, even with this patch, we still run into a problem like #55005: the &self gets deallocated while decrement_num_running_threads still runs. This is still UB even after my proposed spec change, because self contains a field that is not inside an UnsafeCell (the main_thread field). So that part of what &self points to is expected to stay live for the entire duration of the function call.
Another alternative might be to put the entire Scope into an Arc rather than allocating it on the stack frame of scope.
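As a rough illustration of that alternative, here is a minimal sketch (hypothetical names and structure, not the actual std implementation): if the shared bookkeeping lives in an Arc and every spawned thread holds its own clone, the allocation cannot be freed out from under the decrementing thread, because each worker keeps it alive until it is completely done touching it.

use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, Thread};

// Hypothetical, simplified stand-in for ScopeData.
struct ScopeData {
    num_running_threads: AtomicUsize,
    main_thread: Thread,
}

fn scope_sketch(num_workers: usize) {
    let data = Arc::new(ScopeData {
        num_running_threads: AtomicUsize::new(0),
        main_thread: thread::current(),
    });

    for _ in 0..num_workers {
        data.num_running_threads.fetch_add(1, Ordering::Relaxed);
        // Each worker owns a strong reference, so the allocation outlives
        // every access the worker makes, including the final unpark.
        let data = Arc::clone(&data);
        thread::spawn(move || {
            // ... run the user's closure here ...
            if data.num_running_threads.fetch_sub(1, Ordering::Release) == 1 {
                data.main_thread.unpark();
            }
            // The worker's Arc is dropped only after the accesses above.
        });
    }

    // Wait until all workers are done; park() can return spuriously,
    // so re-check the counter in a loop.
    while data.num_running_threads.load(Ordering::Acquire) != 0 {
        thread::park();
    }
}

fn main() {
    scope_sketch(4);
}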
Going to attach this to the 1.63 milestone, since this is in a newly stabilized feature -- we may wish to back out stabilization for a cycle or backport fixes.
Re-opening for beta backport.