should the thread_local_var_accessor gem be included within the concurrent-ruby gem?
Opened this issue · 4 comments
I've written a little gem that makes concurrent-ruby's thread-local variables easy to use, in the same manner of attr_reader, attr_writer, attr_accessor: it is called: thread_local_var_accessors
See:
- https://github.com/aks/thread_local_var_accessors
- https://rubygems.org/gems/thread_local_var_accessors
Eg:
tlv_reader :var1
tlv_writer :var2
tlv_accessor: :var3
I've been using it in my company software for some time now. It is an easy way to encourage developers to improve the thread-safetyness of their code, especially code that runs in a multi-threaded environment (eg: sidekiq tasks).
I'm happy to continue maintaining it separately, but I'm also happy to contribute it to the core concurrent-ruby gem, if that's appropriate. In the latter case, since we are already using concurrent-ruby elsewhere, for other reasons, it would mean one gem less to track for us. 😄
Thanks for writing up the proposal.
I don't know how often this is useful, for example if it's a ThreadLocalVar which can be stored e.g. in a constant/globally there is no need for this.
I took a quick look at the code in https://github.com/aks/thread_local_var_accessors.
One thing which does not seem thread-safe is https://github.com/aks/thread_local_var_accessors/blob/d5d4fd3418d32c1901f719da603f652b7f1cea17/lib/thread_local_var_accessors.rb#L280-L285 specifically for:
class Foo
tlv_accessor: :tvar
def initialize
end
def foo
self.tvar = true
p tvar
end
foo = Foo.new
8.times.map { Thread.new { foo.foo } }Some threads might print nil instead of true because multiple threads might assign @tvar and so overwrite each other.
Given this seems mostly convenience on top of ThreadLocalVar to be a bit shorter (e.g. tvar/self.tvar = v vs @tvar.value/@tvar.value = v), I think it's best as its own gem.
And the fact the ThreadLocalVar is created more magically/behind-the-scenes creates the race mentioned above.
Although that could be solved e.g. by requiring to assign the ivar in initialize, and raising an exception on read/write if it's not the case. (or using a global or per-class lock but that would create lots of contention so not a good solution)
I corrected your test example above, and ran it with thousands of threads. At no point was there ever any error.
However, I'm fine keeping the library in a separate gem.
Thanks for your reply and review.
Here's the updated test program and a sample output:
% cat tlvtest.rb
#!/usr/bin/env ruby
require 'thread_local_var_accessors'
class Foo
include ThreadLocalVarAccessors
tlv_accessor :tvar
def initialize
end
def foo(num)
self.tvar = num
tvar
end
end
obj = Foo.new
count = ARGV.shift&.to_i || 100
puts "Count: #{count}"
data_in = count.times.to_a
data_out = data_in.map { |num| Thread.new { obj.foo(num) }.run.value }
diffs = data_in - data_out
if diffs.size.zero?
puts "no diffs"
else
puts "Diffs: #{diffs}"
end
And, here's the output:
% $ time ./tlvtest.rb 100000
Count: 100000
no diffs
./tlvtest.rb 100000 2.11s user 1.47s system 113% cpu 3.145 totalApologies for reviving a thread (and for a long post)
@aks Your test program uses Thread#run which joins the thread implicitly, making code in essence sequential. It needs to be something like this:
data_in = count.times.to_a
threads = data_in.map { |num| Thread.new { obj.foo(num) } }
loop { break if threads.none?(&:alive?) }
data_out = threads.map(&:value)The code does seem to be thread-safe:
$ ruby tlvtest.rb 10000
Count: 10000
no diffsHowever, this is still not a proper test. Short-lived threads are pretty much guaranteed to run to completion without exhibiting races (at least on CRuby). For example, this modified test "proves" that instance variables are thread-safe:
#!/usr/bin/env ruby
require "thread_local_var_accessors"
class Foo
include ThreadLocalVarAccessors
tlv_accessor :tvar
def foo(num)
self.tvar = num
tvar
end
end
class Bar
def foo(num)
@var = num
@var
end
end
count = ARGV.shift&.to_i || 100
puts "Count: #{count}"
$data_in = count.times.to_a
def run_test(name, klass)
puts "#{name}:"
obj = klass.new
threads = $data_in.map { |num| Thread.new { obj.foo(num) } }
loop { break if threads.none?(&:alive?) }
data_out = threads.map(&:value)
diffs = $data_in - data_out
if diffs.empty?
puts "no diffs"
else
puts "Diffs: #{diffs}"
end
end
run_test("Foo (TVar)", Foo)
run_test("Bar (Instance Var)", Bar)Output:
$ ruby tlvtest.rb 100000
Count: 100000
Foo (TVar):
no diffs
Bar (Instance Var):
no diffsIf the #foo method is modified to explicitly allow passing execution to a different thread, suddenly we get problems:
class Foo
include ThreadLocalVarAccessors
tlv_accessor :tvar
def foo(num)
self.tvar = num
Thread.pass
tvar
end
end
class Bar
def foo(num)
@var = num
Thread.pass
@var
end
endOutput:
ruby tlvtest.rb 10
Count: 10
Foo (TVar):
no diffs
Bar (Instance Var):
Diffs: [0, 1, 2, 3, 4, 5, 6, 9]Be aware however that this doesn't really prove or disprove thread-safety of your code, you would need to test it with modified inner workings, by making it possible to switch threads in problematic spots. I, personally, think that it should be safe, but I'm no expert.