salsa-rs/salsa

[Bug] Deadlock may happen in cancel_other_workers even when there's only one handle

Chronostasys opened this issue · 0 comments

I've encountered a deadlock in salsa in cancel_other_workers (line 151, in the code quoted below). It can occur even when no other snapshot or handle is still active.

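The problem is a misuse of the condvar API. When waiting for a signal that the jars have changed, the check on the jars must itself be protected by a lock, and that same lock must be the one passed to the condvar. The same lock must also be held in Drop when the jars are released and the condvar is notified.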

    /// Sets cancellation flag and blocks until all other workers with access
    /// to this storage have completed.
    ///
    /// This could deadlock if there is a single worker with two handles to the
    /// same database!
    fn cancel_other_workers(&mut self) {
        loop {
            self.runtime.set_cancellation_flag();
            // If we have unique access to the jars, we are done.
            if Arc::get_mut(self.shared.jars.as_mut().unwrap()).is_some() {
                return;
            }
            // Otherwise, wait until some other storage entities have dropped.
            // We create a mutex here because the cvar api requires it, but we
            // don't really need one as the data being protected is actually
            // the jars above.
            //
            // The cvar `self.shared.cvar` is notified by the `Drop` impl.
            let mutex = parking_lot::Mutex::new(());
            let mut guard = mutex.lock();
            self.shared.cvar.wait(&mut guard);
        }
    }
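
To illustrate the pattern I have in mind: in the code above, the uniqueness check and the wait are not atomic with respect to Drop. If the last other handle is dropped and notifies between the check and the wait, the notification is lost and the wait never returns. Below is a minimal sketch of the usual fix. It is not salsa's code: it uses std::sync instead of parking_lot to stay self-contained, and `Shared`, `Handle`, `wait_for_unique_access`, and `demo` are hypothetical names standing in for the real storage types.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical stand-in for the shared state: a count of live handles,
/// guarded by the same mutex that is paired with the condvar.
struct Shared {
    handles: Mutex<usize>,
    cvar: Condvar,
}

/// Hypothetical handle type (not salsa's `Storage`).
struct Handle {
    shared: Arc<Shared>,
}

impl Handle {
    fn new(shared: &Arc<Shared>) -> Handle {
        *shared.handles.lock().unwrap() += 1;
        Handle { shared: Arc::clone(shared) }
    }

    /// Blocks until this is the only live handle.
    fn wait_for_unique_access(&self) {
        let mut live = self.shared.handles.lock().unwrap();
        // The predicate is checked while the lock is held, and `wait`
        // releases the lock atomically, so a notification cannot slip in
        // between the check and the wait.
        while *live > 1 {
            live = self.shared.cvar.wait(live).unwrap();
        }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Update the predicate under the same lock, then notify waiters.
        let mut live = self.shared.handles.lock().unwrap();
        *live -= 1;
        self.shared.cvar.notify_all();
    }
}

/// Spawns a few workers that each drop a handle, waits for unique access,
/// and returns the number of handles still alive afterwards.
fn demo() -> usize {
    let shared = Arc::new(Shared {
        handles: Mutex::new(0),
        cvar: Condvar::new(),
    });
    let main_handle = Handle::new(&shared);
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let h = Handle::new(&shared);
            thread::spawn(move || drop(h))
        })
        .collect();
    main_handle.wait_for_unique_access();
    for w in workers {
        w.join().unwrap();
    }
    let remaining = *shared.handles.lock().unwrap();
    remaining
}
```

Calling `demo()` should return once every worker has dropped its handle, leaving only the main handle alive. The key design point is that the waiter and Drop both take the same lock around the state the condvar is signalling about, rather than a fresh local mutex that no other thread ever locks.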