Lock analysis is not prepared to handle SnatchLock usage

Question

Lock analysis is not prepared to handle SnatchLock usage

jimblandy opened this issue 2 months ago · 4 comments

In #5475, we worked around a deadlock by passing a SnatchGuard by value to maintain. However, this means that SnatchGuards and other kinds of lock guards are not acquired and dropped in a stack-like fashion, which is an assumption made by the locking analysis in lock::ranked.

Specifically, Global::queue_submit acquires a SnatchGuard on the device, and then acquires a read guard on Device::fence. However, the SnatchGuard is passed by value to Device::maintain, which drops it, while the read guard goes out of scope in queue_submit after maintain has returned.

Thus, the SnatchGuard is both acquired first and dropped first, which is a violation of the stacking order, which the lock rank checker assumes.

Kudos to @ErichDonGubler for noting in the review of #5539 that the checker assumes stack-like acquisition and dropping, and suggesting that we actually check that this is the case.

Answer 1 · 2024-04-26T02:28:07.000Z

It does seem that SnatchGuard is the only thing that is not handled in a stack-like fashion.

Answer 2 · 2024-04-27T00:20:44.000Z

The analysis could be changed to handle arbitrary drop order.

Require that the ranks in the define_lock_ranks! macro call appear in order of increasing rank. (At the moment, ranks can appear in any order, but it would probably be clearer to have them in order anyway.)
Let the per-thread state be:
- a LockRankSet in which bits are set and cleared as locks of the given rank are acquired and released, and
- an array of Option<Location> values indexed by rank bit number, indicating where each currently held lock was acquired.
Acquisition applies u64::ilog2 to the LockRankSet to determine the highest-ranked lock currently held, and look that up in a global table of ranks built by the define_lock_ranks! macro to see which new locks are allowed to be acquired.
Lock guards track only their lock's rank bit, so that they can clear it when they're dropped.

This preserves the property that threads only acquire locks listed as a follower of the highest-ranked lock currently held, so the chain of threads blocked on other threads always follows a path of locks of increasing rank, which is the property that ensures there is an unblocked thread, holding the lock of highest rank.

If you squint, the current implementation maintains a stack of states, where each lock guard holds the state to be restored when it is dropped. This replaces that scattered stack with a single bitmask, and uses ilog2 to scan the bitmask to find the maximum-ranked held lock to validate an acquisition.

Answer 3 · 2024-04-27T00:21:01.000Z

But it might be easier to just change the code to remain stack-like.

Answer 4 · 2024-04-27T01:51:17.000Z

But it might be easier to just change the code to remain stack-like.

I agree we should try to get back to stack like.