open-telemetry/opentelemetry-ruby

Spans from Sidekiq workers are missing after upgrading to `v1.2.1`

jdeff opened this issue · 6 comments

jdeff commented

Description of the bug

After upgrading only the opentelemetry-sdk gem from v1.2.0 to v1.2.1, I started seeing a lot of "Calling finish on an ended Span." warnings from Sidekiq. I also found that I am only receiving a single span (the parent) for all of my Sidekiq workers. After reverting the SDK back down to v1.2.0 these problems went away.

The only thing that looks to have changed in 1.2.1 was #1412.

Share details about your runtime

Operating system details: macOS 13.4.1 (22F82)
RUBY_ENGINE: "ruby"
RUBY_VERSION: "3.2.2"
RUBY_DESCRIPTION: "ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]"

jdeff commented

Here are is a simple example. You will need to have redis running locally and available on the default port.

Gemfile

# frozen_string_literal: true

source 'https://rubygems.org'

gem 'opentelemetry-api'
gem 'opentelemetry-common'
gem 'opentelemetry-instrumentation-base'
gem 'opentelemetry-instrumentation-sidekiq'
gem 'opentelemetry-sdk'

gem 'sidekiq'

Gemfile.lock

GEM
  remote: https://rubygems.org/
  specs:
    concurrent-ruby (1.2.2)
    connection_pool (2.4.1)
    opentelemetry-api (1.2.0)
    opentelemetry-common (0.19.7)
      opentelemetry-api (~> 1.0)
    opentelemetry-instrumentation-base (0.22.1)
      opentelemetry-api (~> 1.0)
      opentelemetry-registry (~> 0.1)
    opentelemetry-instrumentation-sidekiq (0.24.1)
      opentelemetry-api (~> 1.0)
      opentelemetry-common (~> 0.19.3)
      opentelemetry-instrumentation-base (~> 0.22.1)
    opentelemetry-registry (0.3.0)
      opentelemetry-api (~> 1.1)
    opentelemetry-sdk (1.2.1)
      opentelemetry-api (~> 1.1)
      opentelemetry-common (~> 0.19.3)
      opentelemetry-registry (~> 0.2)
      opentelemetry-semantic_conventions
    opentelemetry-semantic_conventions (1.10.0)
      opentelemetry-api (~> 1.0)
    rack (3.0.8)
    redis-client (0.14.1)
      connection_pool
    sidekiq (7.1.2)
      concurrent-ruby (< 2)
      connection_pool (>= 2.3.0)
      rack (>= 2.2.4)
      redis-client (>= 0.14.0)

PLATFORMS
  arm64-darwin-22

DEPENDENCIES
  opentelemetry-api
  opentelemetry-common
  opentelemetry-instrumentation-base
  opentelemetry-instrumentation-sidekiq
  opentelemetry-sdk
  sidekiq

BUNDLED WITH
   2.4.10

sidekiq_example.rb

# frozen_string_literal: true

require "rubygems"
require "bundler/setup"

Bundler.require

# Export traces to console by default
ENV["OTEL_TRACES_EXPORTER"] ||= "console"

OpenTelemetry::SDK.configure do |c|
  c.use "OpenTelemetry::Instrumentation::Sidekiq"
end

# A basic Sidekiq job worker example
class SimpleJob
  include Sidekiq::Worker

  def perform
    OpenTelemetry.tracer_provider.tracer("sidekiq_example", "0.0.1").in_span("foo") do
      puts "Workin'"
    end
  end
end

if ARGV[0] == "client"
  SimpleJob.perform_async
end
jdeff commented

1. Queue the job

❯ ruby sidekiq_example.rb client

Output

I, [2023-07-18T16:03:06.400647 #9479]  INFO -- : Instrumentation: OpenTelemetry::Instrumentation::Sidekiq was successfully installed with the following options {:span_naming=>:queue, :propagation_style=>:link, :trace_launcher_heartbeat=>false, :trace_poller_enqueue=>false, :trace_poller_wait=>false, :trace_processor_process_one=>false, :peer_service=>nil}
2023-07-18T20:03:06.400Z pid=9479 tid=70b INFO: Sidekiq 7.1.2 connecting to Redis with options {:size=>10, :pool_name=>"internal", :url=>nil}
#<struct OpenTelemetry::SDK::Trace::SpanData
 name="default send",
 kind=:producer,
 status=#<OpenTelemetry::Trace::Status:0x000000010565d5f0 @code=1, @description="">,
 parent_span_id="\x00\x00\x00\x00\x00\x00\x00\x00",
 total_recorded_attributes=5,
 total_recorded_events=1,
 total_recorded_links=0,
 start_timestamp=1689710586400870000,
 end_timestamp=1689710586400903000,
 attributes=
  {"messaging.system"=>"sidekiq",
   "messaging.sidekiq.job_class"=>"SimpleJob",
   "messaging.message_id"=>"c1634515acddda2474fb13bd",
   "messaging.destination"=>"default",
   "messaging.destination_kind"=>"queue"},
 links=nil,
 events=[#<struct OpenTelemetry::SDK::Trace::Event name="created_at", attributes={}, timestamp=1689710586400817871>],
 resource=
  #<OpenTelemetry::SDK::Resources::Resource:0x0000000105753ec8
   @attributes=
    {"service.name"=>"unknown_service",
     "process.pid"=>9479,
     "process.command"=>"sidekiq_example.rb",
     "process.runtime.name"=>"ruby",
     "process.runtime.version"=>"3.2.2",
     "process.runtime.description"=>"ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]",
     "telemetry.sdk.name"=>"opentelemetry",
     "telemetry.sdk.language"=>"ruby",
     "telemetry.sdk.version"=>"1.2.1"}>,
 instrumentation_scope=#<struct OpenTelemetry::SDK::InstrumentationScope name="OpenTelemetry::Instrumentation::Sidekiq", version="0.24.1">,
 span_id="\xACg+\x84\xBF\x06\xD7H",
 trace_id="`uV4l\xFB\x8B\x04\x04=R\x03\x9C+(U",
 trace_flags=#<OpenTelemetry::Trace::TraceFlags:0x00000001054f9970 @flags=1>,
 tracestate=#<OpenTelemetry::Trace::Tracestate:0x00000001054f0cd0 @hash={}>>

2. Run the sidekiq server

bundle exec sidekiq -r ./sidekiq_example.rb

Output

I, [2023-07-18T16:04:24.659634 #9874]  INFO -- : Instrumentation: OpenTelemetry::Instrumentation::Sidekiq was successfully installed with the following options {:span_naming=>:queue, :propagation_style=>:link, :trace_launcher_heartbeat=>false, :trace_poller_enqueue=>false, :trace_poller_wait=>false, :trace_processor_process_one=>false, :peer_service=>nil}


               m,
               `$b
          .ss,  $$:         .,d$
          `$$P,d$P'    .,md$P"'
           ,$$$$$b/md$$$P^'
         .d$$$$$$/$$$P'
         $$^' `"/$$$'       ____  _     _      _    _
         $:    ',$$:       / ___|(_) __| | ___| | _(_) __ _
         `b     :$$        \___ \| |/ _` |/ _ \ |/ / |/ _` |
                $$:         ___) | | (_| |  __/   <| | (_| |
                $$         |____/|_|\__,_|\___|_|\_\_|\__, |
              .d$$                                       |_|


2023-07-18T20:04:24.659Z pid=9874 tid=97q INFO: Running in ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
2023-07-18T20:04:24.659Z pid=9874 tid=97q INFO: See LICENSE and the LGPL-3.0 for licensing details.
2023-07-18T20:04:24.659Z pid=9874 tid=97q INFO: Upgrade to Sidekiq Pro for more features and support: https://sidekiq.org
2023-07-18T20:04:24.659Z pid=9874 tid=97q INFO: Sidekiq 7.1.2 connecting to Redis with options {:size=>10, :pool_name=>"internal", :url=>nil}
2023-07-18T20:04:24.665Z pid=9874 tid=97q INFO: Sidekiq 7.1.2 connecting to Redis with options {:size=>5, :pool_name=>"default", :url=>nil}
2023-07-18T20:04:24.665Z pid=9874 tid=97q INFO: Starting processing, hit Ctrl-C to stop
2023-07-18T20:04:24.671Z pid=9874 tid=966 class=SimpleJob jid=c1634515acddda2474fb13bd INFO: start
Workin'
#<struct OpenTelemetry::SDK::Trace::SpanData
 name="default process",
 kind=:consumer,
 status=#<OpenTelemetry::Trace::Status:0x0000000103945c10 @code=1, @description="">,
 parent_span_id="\x00\x00\x00\x00\x00\x00\x00\x00",
 total_recorded_attributes=6,
 total_recorded_events=2,
 total_recorded_links=1,
 start_timestamp=1689710664671493000,
 end_timestamp=1689710664671540000,
 attributes=
  {"messaging.system"=>"sidekiq",
   "messaging.sidekiq.job_class"=>"SimpleJob",
   "messaging.message_id"=>"c1634515acddda2474fb13bd",
   "messaging.destination"=>"default",
   "messaging.destination_kind"=>"queue",
   "messaging.operation"=>"process"},
 links=
  [#<OpenTelemetry::Trace::Link:0x00000001036328e8
    @attributes={},
    @span_context=
     #<OpenTelemetry::Trace::SpanContext:0x000000010355c310
      @remote=true,
      @span_id="\xACg+\x84\xBF\x06\xD7H",
      @trace_flags=#<OpenTelemetry::Trace::TraceFlags:0x0000000103633590 @flags=1>,
      @trace_id="`uV4l\xFB\x8B\x04\x04=R\x03\x9C+(U",
      @tracestate=#<OpenTelemetry::Trace::Tracestate:0x000000010326b070 @hash={}>>>],
 events=
  [#<struct OpenTelemetry::SDK::Trace::Event name="created_at", attributes={}, timestamp=1689710586400817871>,
   #<struct OpenTelemetry::SDK::Trace::Event name="enqueued_at", attributes={}, timestamp=1689710586403222084>],
 resource=
  #<OpenTelemetry::SDK::Resources::Resource:0x0000000103656bd0
   @attributes=
    {"service.name"=>"unknown_service",
     "process.pid"=>9874,
     "process.command"=>"/Users/jdeff/.asdf/installs/ruby/3.2.2/bin/sidekiq",
     "process.runtime.name"=>"ruby",
     "process.runtime.version"=>"3.2.2",
     "process.runtime.description"=>"ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]",
     "telemetry.sdk.name"=>"opentelemetry",
     "telemetry.sdk.language"=>"ruby",
     "telemetry.sdk.version"=>"1.2.1"}>,
 instrumentation_scope=#<struct OpenTelemetry::SDK::InstrumentationScope name="OpenTelemetry::Instrumentation::Sidekiq", version="0.24.1">,
 span_id="d\x0E\xCC\xC1\x10\xB3\x84\x90",
 trace_id="b2\b\xA0\xCC\xF8\xC9\x93%\xDA\xBA\xEA\x803\x04p",
 trace_flags=#<OpenTelemetry::Trace::TraceFlags:0x00000001032ca890 @flags=1>,
 tracestate=#<OpenTelemetry::Trace::Tracestate:0x000000010326b070 @hash={}>>
W, [2023-07-18T16:04:24.678488 #9874]  WARN -- : Calling finish on an ended Span.
2023-07-18T20:04:24.678Z pid=9874 tid=966 class=SimpleJob jid=c1634515acddda2474fb13bd elapsed=0.008 INFO: done

We have a similar issue opened in the instrumentation repo: open-telemetry/opentelemetry-ruby-contrib#544

Would you mind sharing your comments there instead?

I can confirm this issue is still present even on opentelemetry-sdk (1.3.0)

W, [2023-08-14T17:40:18.014782 #1]  WARN -- : Calling set_attribute on an ended Span.
W, [2023-08-14T17:40:18.014860 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.017897 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.061842 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.064870 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.065509 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.066852 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.067042 #1]  WARN -- : Calling finish on an ended Span.
W, [2023-08-14T17:40:18.067125 #1]  WARN -- : Calling finish on an ended Span.

👋 This issue has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this issue will be closed eventually by the stale bot.