Segfault(s) with 5.5.0
aharpervc opened this issue · 3 comments
Hi, I'm in the process of upgrading a Rails app, and updated dogstatd-ruby from 3.3.0 to 5.5.0. The former version had been working fine without issue for years, but now with the upgraded version I'm now seeing a segfault. The only code that's changed is related to this gem. I'm on Rails 5.2.2.1 although hopefully that part shouldn't matter.
My code is basically this (somewhat simplified):
# config\initializers\datadog_instrumentation.rb
require 'datadog/statsd'
STATSD = Datadog::Statsd.new('localhost', 8125, buffer_flush_interval: 60)
class DataDogInstrumentation
def self.notify app, event, *data
STATSD.count "#{app}.#{event}", data.first[:count]
end
end
# newly added to follow the v5 requirement of always calling `close`
at_exit do
STATSD.close
end
# code elsewhere in the Rails app does this (simplified)
DataDogInstrumentation.notify "my_app", "my_event", count: 5
Here's the log (slightly edited & truncated for brevity; line 30 in the original code is STATSD.close
in the at_exit
block):
[BUG]
Segmentation fault at 0x00005563f7bb5f21
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-linux]
-- Ruby level backtrace information ----------------------------------------
/my-apps-path-here/releases/20220923182902/config/initializers/datadog_instrumentation.rb:30:in `block in <main>'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd.rb:380:in `close'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/forwarder.rb:109:in `close'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/sender.rb:107:in `stop'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/timer.rb:37:in `stop'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/timer.rb:37:in `synchronize'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/timer.rb:38:in `block in stop'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/timer.rb:38:in `signal'
That's this code:
dogstatsd-ruby/lib/datadog/statsd/timer.rb
Lines 33 to 42 in 5cb9a87
I thought maybe there was something weird going on with the at_exit
block, so I removed that part for testing, but I'm still getting a segfault. The stack trace is a bit different (again, somewhat edited/snipped for clarity/brevity):
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/sender.rb:76:
[BUG]
Segmentation fault at 0x0000562bf467b1b5
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-linux]
-- Ruby level backtrace information ----------------------------------------
/my-apps-path-here/releases/20220923194611/config/initializers/datadog_instrumentation.rb:13:in `notify'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd.rb:162:in `increment'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd.rb:190:in `count'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd.rb:430:in `send_stats'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/forwarder.rb:75:in `send_message'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/sender.rb:70:in `add'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/sender.rb:70:in `synchronize'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/sender.rb:76:in `block in add'
/my-apps-path-here/shared/bundle/ruby/2.5.0/gems/dogstatsd-ruby-5.5.0/lib/datadog/statsd/sender.rb:76:in `close'
That's this:
dogstatsd-ruby/lib/datadog/statsd/sender.rb
Lines 63 to 89 in 5cb9a87
So, different code, but line 76 is message_queue.close if CLOSEABLE_QUEUES
, so maybe the same root cause if it's related to closing something.
At first glance this seems like an issue with dogstatsd-ruby itself, especially since there had been no issues at all on the prior version.
Thoughts?
A workaround suggested by DataDog technical support was to add the single_thread: true
option to the initializer. This does appear to make the segfault(s) go away.
Thanks for the detailed report! I was able to reproduce this crash with the following ruby program:
require './datadog_instrumentation.rb' # using your snippet from the first post untouched
def do_work
puts "Counting... " + Time.new.to_s
DataDogInstrumentation.notify "my_app", "my_event", count: 5
end
def crash
do_work
sleep 1
Process.fork() # leak the child
do_work
Process.exit()
end
def dontcrash
do_work
sleep 1
do_work
Process.exit()
end
puts ARGV.first
if ARGV.first == "crash"
crash
else
dontcrash
end
With uname -a
:
Linux dev-ruby-repro 4.15.0-193-generic #204-Ubuntu SMP Fri Aug 26 19:25:57 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
I'm able to see a crash almost every single time running ruby main.rb crash
against ruby 2.5.3
.
Upgrading to ruby 2.5.9
fixes the crash for me, I'm completely unable to reproduce it.
(Note I'm testing on a slightly different platform, aarch64
vs amd64
, so you may see slightly different results)
I would recommend upgrading ruby to2.5.9
version of ruby (or newer, 2.5
and 2.6
are both EOL according to this)
If this is not an option, then single_thread: true
should help, however this doesn't completely remove the crash for me, it just makes it less frequent.