Synchronization layer
pitr-ch opened this issue · 12 comments
Hello,
I would like to build a common synchronization layer in concurrent-ruby
having following API:
def Concurrent.lock(object); end #reentrant
def Concurrent.try_lock(object); end
def Concurrent.unlock(object); end
def Concurrent.synchronize(object, &block); end
def Concurrent.wait(object, timeout = nil); end # as Java Object#wait
def Concurrent.signal(object); end # as Java Object#notify
def Concurrent.broadcast(object); end # as Java Object#notifyAll
Which would have different implementation based on Ruby interpreter.
I kindly ask @brixen @headius @hone @ko1 @yorickpeterse @zzak to help with finding the best options for the different implementations. This PR is intended to serve to discus the options and issues, I'll open PR with implementation later to discuss details based on what is figured out here.
Proposed implementations
MRI
Injecting Monitor to an object, something like:
require 'monitor'
INJECTION_BARRIER = Mutex.new
def synchronize(object)
get_mutex(object).synchronize { yield }
end
def lock(object)
get_mutex(object).enter
end
def unlock(object)
get_mutex(object).exit
end
def get_mutex(object)
object.instance_variable_get(:@__monitor__) or INJECTION_BARRIER.synchronize do
object.instance_variable_get(:@__monitor__) or
object.instance_variable_set(:@__monitor__, Monitor.new)
end
end
Similarly a ConditionVariable
would be injected to support wait
, signal
, broadcast
.
JRuby
Delegate to methods on JRuby
module, see this PR jruby/jruby#2594.
It uses wait
, notify
, notifyAll
methods on Java object to implement wait
, signal
, broadcast
. synchronize
runs a Block
wrapped in synchronized
Java keyword. lock
, unlock
, try_lock
are using methods monitorEnter
, monitorExit
, tryMonitorEnter
on sun.misc.Unsafe
.
Rubinius
Implement lock
, unlock
, try_lock
and synchronize
by delegation to same methods on Rubinius
module. wait
, signal
, broadcast
would be supported by injecting ConditionalVariable
into object's instance variable (when locked).
lock
, unlock
, try_lock
are not really needed for concurrent-ruby
so far, we will be fine with synchronize
so they can be omitted if it turns out to be a bad idea to use sun.misc.Unsafe
for its implementation on JRuby.
Rubinius currently provides the following methods for this:
Rubinius.lock
Rubinius.try_lock
Rubinius.locked?
Rubinius.unlock
Rubinius.lock_timed
(can be used to time out after waiting for a lock to release)Rubinius.memory_barrier
Rubinius.synchronize
Rubinius.uninterrupted_lock
(currently only used byThread#__run__
here: https://github.com/rubinius/rubinius/blob/54544961d044a93f5b67dd87fb43c4b923bcc811/kernel/bootstrap/thread.rb#L411Rubinius.check_interrupts
The last two are mainly useful for thread scheduling/management and not so much for synchronization of arbitrary objects.
Some examples of these synchronization methods can be found in the Mutex class: https://github.com/rubinius/rubinius/blob/54544961d044a93f5b67dd87fb43c4b923bcc811/kernel/common/mutex.rb
Generally I'd say it's better to use implementation specific APIs here versus injecting a Mutex/Monitor. Given you inject objects you'll need a name for the instance variable, and that's bound to conflict with something sooner or later unless you'd call it something like @__concurrentruby_really_this_is_our_private_mixin_instance_variable_dont_touch_it_
. Also, the injecting of objects might have to be synchronized itself, I'd have to look if we have any barriers in place in Rubinius to mitigate that need.
@chrisseaton gave me a good question on concurrent-ruby chat: "What is the motivation?" Sorry for omitting in the first place, let me fix that.
There are essentially 2 reasons for this: First is performance, allowing the best option to be used on any given Ruby implementation.
Second is following code which is not guaranteed to be safe on JRuby and Rubinius, AFAIK.
def initialize
@mutex = Mutex.new
end
def do_something
@mutex.synchronize { something }
end
There are no locks around method calls and instance variable access is not volatile, so if the object is initialized one thread and do_something
called on another there is no guarantee that the @mutex
will be visible. If you look into JRuby source code the instance variable access actually has volatile behavior [1] but the documentation says otherwise, meaning the implementation may change. Regarding Rubinius I am little more in the dark, based on my googling and code digging it looks the problem applies too.
This can be solved by using Rubinius.synchronize
or JRuby.reference0(object).synchronized
to synchronize around the object without need to access mutex in instance variable. Or with the layer in place by def do_something; Concurrent.synchronize(self) { something }; end
.
[1] https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/runtime/ivars/StampedVariableAccessor.java https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/runtime/ivars/SynchronizedVariableAccessor.java
The code example @pitr-ch shows, with a mutex being created in the constructor, is something we do all the time in this gem. I'll be honest that these low-level implementation details aren't my forte. My position is that if the aforementioned code sample cannot provide guaranteed behavior across the three major interpreters, then we should do something about it. The common synchronization layer being proposed makes sense to me, but I trust the judgement of @pitr-ch and @chrisseaton.
I'm fairly sure that the above snippet is thread-safe at least on Rubinius as creating of instance variables is not deferred to the background (or otherwise done in a non blocking manner). Also, since the variable is set up in the initialize
method it should be fully set up before any thread can even use it.
The following case however could be problematic depending on the implementation:
def do_something
mutex.synchronize { something }
end
def mutex
@mutex ||= Mutex.new # Unless ||= is atomic N threads could attempt this at the same time
end
@yorickpeterse what you are saying makes sense with sequential consistency, but unless there is some kind of synchronisation or memory barrier, what makes you think that the write to @mutex
will always be observed before the write to whatever variable or other location stores the object that contains the mutex, from the perspective of another thread? You may have written to A and then B, but another thread might see the write to B, but not A yet.
@yorickpeterse Thanks for your response!
I agree with you that it's better to use the API and I am suggesting the same for lock
, unlock
, try_lock
and synchronize
. But I could not find any methods on Rubinius
which would directly support wait
, signal
, broadcast
. ConditionVariable
implementation on Rubinius uses Rubinius::Channel
.
I like the variable name :) yeah it's not a perfect solution to inject into the objects. I am looking for a better one.
@pitr-ch Rubinius does not use any memory barrier for instance variables so they do not have volatile (Java definition) semantics.
We would not want to see ruby-concurrency injecting anything into objects on Rubinius. I can add the wait
, signal
, broadcast
API that you listed as it's quite consistent with the existing API we have with eg Rubinius.lock(object)
, etc.
I agree injecting is not a good solution. Thanks @brixen for the offer to extend the Rubinius API! That will be great!
Just to add more context for possible readers.
There was a Concurrent::Synchronization::Object
created which provides methods synchronize
, wait
, signal
, broadcast
and other utilities to create volatile and final fields. concurrent-ruby
classes then usually inherit from this class. Concurrent::Synchronization::Object
has different implementations based on the particular behavior of a given Ruby implementation (or version). (This approach avoids any ugly ivar injections, as discussed in this issue.)
I'll finish better documentation for the Synchronization::Object
first so it's more understandable and does not force anybody to dig through source code. Then I plan to contact Ruby implementors to help verify correctness of this layer. After that it'll be released in 0.9
and stabilized for 1.0
release. Tracked in #292.
👍