Swap/store operation is not lock-free
Diggsey opened this issue · 6 comments
The definition from Wikipedia for lock-freedom is this:
In particular, if one thread is suspended, then a lock-free algorithm guarantees that the remaining threads can still make progress. Hence, if two threads can contend for the same mutex lock or spinlock, then the algorithm is not lock-free. (If we suspend one thread that holds the lock, then the second thread will block.)
In the current implementation, if any reader is suspended whilst holding the lock, all writers will be starved, so only the load
operation is lock-free.
Yes, right… also probably very related to this from reddit:
...provided no threads receive the signal in the middle of a load, right? You'd have to pthread_sigmask the signal out prior to calling load (and can unmask it afterward).
(hinting that putting store inside a signal handler is probably not good). I guess I'll update the documentation first and then try to come up with some better solution.
In general, you need space proportional to the number of accessors for truly lock/wait free algorithms. That means either doing memory allocation, or restricting the number of accessors to a predetermined limit.
In the past I've implemented concurrent data structures by building-in Arc
-like handles which either use a RwLock
and resize their interior when cloned (meaning that cloning a reference to your lock-free data structure is not lock free, but all operations on a pre-existing handle can be made lock-free) or simply panic/fail to clone if a predetermined limit is reached. This obviously has a performance impact though.
Another option would be to use a fixed limit, but protect it with a semaphore, and maybe have a separate slower Handle
type which is not clone-able but has guaranteed lock-free access (by always holding the semaphore).
The semaphore solution is quite neat because it allows you a high degree of control over the amount of contention allowed on the data structure, so there's a scale from being completely lock free to completely contention free (with a binary semaphore).
For now I've updated the docs and the blog post, so they are not misleading.
I'll try to think about how to make swap lock-free too, but maybe I'll just decide I want load as fast as possible (and lock-free), but swap doesn't matter that much. I aim at read-heavy scenarios, mostly, where updates are rather rare.
I've done some experiments with other approaches and they turn out slower in practice. As there's another crate in making (https://github.com/stjepang/atomic) which'll offer structure with similar interface, but using hazard pointers and will have both load and store lock free, I've decided to leave it the way it is, so users have the possibility to choose between the approaches.
I'll keep it open, though, in case someone comes up with an idea how to improve the situation without losing wait-freeness of load or its performance.
Would you consider adding the lock free version an another stuct in your crate?
That depends on circumstances and what exactly you ask about.
If you have a pull request (or plan on writing one) and there would be some significant code reuse between these two structures to warrant living in the same crate, then possibly.
If you ask me to write another implementation in addition to what I've already written (which would be most probably the hazard pointers), then no. I don't have the time and motivation to do so.
However, if you need a data structure that is lock-free on both load and store/balanced in performance, I'd suggest you volunteer to help finish the atomic crate linked above ‒ I believe it is still not released simply due to lack of time, but it works ‒ it probably needs some more polishing touches.