GlobalQueue must use Scheduler's Mutex
Closed this issue · 5 comments
GlobalQueue uses its own Crystal::SpinLock
to protect mutations of its internal linked list of fibers, while the schedulers use their own Thread::Mutex to park themselves.
This creates a race condition with cross context enqueues:
- Context A: has nothing to do
- Context B: enqueues a fiber into Context A
- Context A: locks the mutex
- Context A: checks the global queue: lock -> empty? -> unlock
- Context B: global queue (of Context A) -> lock -> push -> unlock
- Context A: waits on the condition variable
- Context A: blocked until another fiber is enqueued
Solution: we must have a single lock (Thread::Mutex) to protect both GlobalQueue and to park a scheduler thread.
Since OS mutex are kinda slow in general, there may be some usefulness in a Go-like futex based mutex (on linux) after all: https://gist.github.com/ysbaddaden/7e99734c0ca46cbc7c033060cc90be24
Which means we'll need to have our own condition variable too (futex based?)
Windows has futex as well: https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitonaddress
Even macOS, although only in the latest version: https://developer.apple.com/documentation/os/4316531-os_sync_wait_on_address
And several BSDs seem to have it as well.
Of course we don't need all of them at once and can fall back to mutex. It's nice to know we can expand into further OSes.
Oooh nice!
An alternative would be to avoid the lazy return if @parked == 0
; it would remove lazy accesses of @parked
, but means we must lock the mutex just to know if a thread is parked which sounds inefficient (especially on local enqueues where all threads are running).
On the other hand, having a single Mutex means we'll lock the mutex to enqueue to GQ, then maybe have to lock it again to wake a scheduler, but would only do the second lock when there is a potential thread to wakeup (and @parked
may need to be an atomic).
We want to avoid locks as much as possible. It's already unfortunate that cross context enqueues need to grab a lock, we must avoid locking for local enqueues as much as possible, so having a single mutex and the ability to check for parked threads without grabbing a lock sounds better.