Lock analysis is not prepared to handle SnatchLock usage
jimblandy opened this issue · 4 comments
In #5475, we worked around a deadlock by passing a SnatchGuard
by value to maintain
. However, this means that SnatchGuard
s and other kinds of lock guards are not acquired and dropped in a stack-like fashion, which is an assumption made by the locking analysis in lock::ranked
.
Specifically, Global::queue_submit
acquires a SnatchGuard
on the device, and then acquires a read guard on Device::fence
. However, the SnatchGuard
is passed by value to Device::maintain
, which drops it, while the read guard goes out of scope in queue_submit
after maintain
has returned.
Thus, the SnatchGuard
is both acquired first and dropped first, which is a violation of the stacking order, which the lock rank checker assumes.
Kudos to @ErichDonGubler for noting in the review of #5539 that the checker assumes stack-like acquisition and dropping, and suggesting that we actually check that this is the case.
It does seem that SnatchGuard
is the only thing that is not handled in a stack-like fashion.
The analysis could be changed to handle arbitrary drop order.
- Require that the ranks in the
define_lock_ranks!
macro call appear in order of increasing rank. (At the moment, ranks can appear in any order, but it would probably be clearer to have them in order anyway.) - Let the per-thread state be:
- a
LockRankSet
in which bits are set and cleared as locks of the given rank are acquired and released, and - an array of
Option<Location>
values indexed by rank bit number, indicating where each currently held lock was acquired.
- a
- Acquisition applies
u64::ilog2
to theLockRankSet
to determine the highest-ranked lock currently held, and look that up in a global table of ranks built by thedefine_lock_ranks!
macro to see which new locks are allowed to be acquired. - Lock guards track only their lock's rank bit, so that they can clear it when they're dropped.
This preserves the property that threads only acquire locks listed as a follower of the highest-ranked lock currently held, so the chain of threads blocked on other threads always follows a path of locks of increasing rank, which is the property that ensures there is an unblocked thread, holding the lock of highest rank.
If you squint, the current implementation maintains a stack of states, where each lock guard holds the state to be restored when it is dropped. This replaces that scattered stack with a single bitmask, and uses ilog2
to scan the bitmask to find the maximum-ranked held lock to validate an acquisition.
But it might be easier to just change the code to remain stack-like.
But it might be easier to just change the code to remain stack-like.
I agree we should try to get back to stack like.