Multiple tracers in one schema are ignored
brendo opened this issue ยท 15 comments
Describe the bug
This bug almost appears to be a duplicate of #4542 but this time around it's with the ActiveSupport and OpenTelemetry tracers.
module Graph
class Schema < GraphQL::Schema
...
trace_with GraphQL::Tracing::ActiveSupportNotificationsTrace end
end
This schema, with the default OpenTelemetry
setup, sees the OpenTelemetry tracer completely ignored as spans are not included in traces yet metrics are emitted.
If I mix and match the old tracer API, I'm able to get both working:
module Graph
class Schema < GraphQL::Schema
...
tracer GraphQL::Tracing::ActiveSupportNotificationsTracing
end
end
Versions
graphql
: 2.3.10
opentelemetry-instrumentation-graphql
: 0.28.2
rails
: 7.1.3.4
Steps to reproduce
The test case demonstrates the opposite to the above due to how our app is built, but I think it's still sufficient to show the bug.
# frozen_string_literal: true
require "bundler/inline"
gemfile do
gem "graphql", "2.3.10"
gem 'opentelemetry-api'
gem 'opentelemetry-common'
gem 'opentelemetry-instrumentation-graphql'
gem 'opentelemetry-sdk'
gem 'rails'
end
module MultipleTracer
class Thing < GraphQL::Schema::Object
field :str, String
def str; "blah"; end
end
class Named < GraphQL::Schema::Union
possible_types Thing
def self.resolve_type(obj, ctx)
Thing
end
end
class Query < GraphQL::Schema::Object
include GraphQL::Types::Relay::HasNodeField
field :int, Integer, null: false
def int
1
end
field :thing, Thing
def thing; :thing; end
field :named, Named, resolver_method: :thing
end
class TestSchema < GraphQL::Schema
query(Query)
trace_with(GraphQL::Tracing::ActiveSupportNotificationsTrace)
end
end
# Comment from here....
ENV['OTEL_TRACES_EXPORTER'] = 'console'
OpenTelemetry::SDK.configure do |c|
c.use_all('OpenTelemetry::Instrumentation::GraphQL' => { schemas: [MultipleTracer::TestSchema] })
end
# ...to here to demonstrate bug
ActiveSupport::Notifications.monotonic_subscribe("execute_query.graphql") do |name, start, finish, _message_id, data|
puts "#{name} handled"
puts "#{data.inspect}"
end
query = "{ int thing { str } named { ... on Thing { str } } }"
MultipleTracer::TestSchema.execute(query)
The console will be messy, but if you search for execute_query.graphql handled
it will be absent. If these lines are commented out, and the command is re-run, you'll see the ActiveSupport handled messages.
ENV['OTEL_TRACES_EXPORTER'] = 'console'
OpenTelemetry::SDK.configure do |c|
c.use_all('OpenTelemetry::Instrumentation::GraphQL' => { schemas: [MultipleTracer::TestSchema] })
end
$ ruby inline.rb
execute_query.graphql handled
{:query=>query ...}
Expected behavior
Both tracers should be able to emit information about the GraphQL operations. So with the above example there should be a wall of trace info, as well as the two littles lines added by the ActiveSupport subscriber.
Actual behavior
The span information is emitted, but nothing from the ActiveSupport subscriber.
Hey, thanks for the detailed report and sorry for the trouble! Yes, these should definitely work well together.
I think the problem is that these methods in the OpenTelementry trace module are missing super
calls (which would call other trace modules): https://github.com/open-telemetry/opentelemetry-ruby-contrib/blob/35ddf66f9c42a228298ef1c9f333abb65d6ddc45/instrumentation/graphql/lib/opentelemetry/instrumentation/graphql/tracers/graphql_trace.rb#L91-L110
I noticed that other methods there did include super
calls, and that your test case was on a top-level instrumentation hook, which made me suspect this. I can take a look soon at adding those, or feel free to add them yourself if you get time.
Nope, it's more complicated than that ... I've got a failing test over at #5045, I have to massage the inheritance code to make it work ๐
Well, maybe it's both. Looking at your example, I realized its different than the one I encountered in the OpenTelementry test suite. My initial suggestion, adding super
calls, actually did make your script pass for me. So I opened a PR with those changes: open-telemetry/opentelemetry-ruby-contrib#1090
However, the tests aren't passing because of the problem identified in #5045 ... Nevermind, I think the weirdness is in the test suite there, because of how schema classes are reset and re-instrumented for each test.
Thanks so much for the speedy response! I'll try to test that branch of open-telemetry to confirm it's working today :)
I retested today with the newly released opentelemetry-instrumentation-graphql (0.28.4)
and unfortunately I'm still able to reproduce the bug in my application, but not the test case above. Will have to dig and see what's the difference ๐ค
Do you pass schemas: [ ... ]
in your app's installation config? If not, I suspect wonkiness there... (Without schemas: [ ... ]
, the OpenTelemetry stuff is added to GraphQL::Schema
, but that might not be handled well through inheritance, which I stumbed into in #5045)
Do you pass
schemas: [ ... ]
in your app's installation config?
We don't actually. Nice pickup. Let me find out how I can set that in our service (the configuration of OpenTelemetry is done via another gem by default).
๐ did you find any clues about this? I'd love to get it working properly...
Hey! Sorry, I thought I posted back. We experimented with setting the schemas via the OTEL_RUBY_INSTRUMENTATION_GRAPHQL_CONFIG_OPTS
variable but this didn't quite work because the tracer installers expect the option to be an instance of the schema, not a string:
I attempted some hacking to use const_get
/constantize
but this failed with some autoloading errors (schema hadn't been discovered yet in the boot process). I haven't had a chance to get back into looking at alternative options yet and at the moment are "making it work" by mixing the deprecated and new trace APIs.
Ok, thanks for the update. I tinkered with the original script above to try to get it to fail, but I couldn't...
I also reopened #5045 in case that's the cause, and I pushed a fix for the issue spotted there. Could you try your app with that patch?
Oops, that issue autoclosed this one. It might fix it -- if you get a chance to try out master now that it's been merged, let me know what you find.
Oops, that issue autoclosed this one. It might fix it -- if you get a chance to try out master now that it's been merged, let me know what you find.
Sorry, a retest confirms the bug. Switching from tracer GraphQL::Tracing::ActiveSupportNotificationsTracing, silence_deprecation_warning: true
to trace_with GraphQL::Tracing::ActiveSupportNotificationsTrace
causes the trace to lose the Otel spans.
I don't know if this helps at all, but here's a screenshot from a debug which shows both modules appear to be added:
I'm missing a lot of context, but is the problem that both modules (OpenTelemetry::Instrumentation::GraphQL::Tracers::GraphQLTrace
and GraphQL::Tracing::ActiveSupportNotificationsTrace
) define the same method names and because the AS:N
is added last, it overrides the OTel module?
eg. analyze_query
:
Ahhh yes! I hadn't looked at the AS::N trace yet, but I guess that's the issue. It doesn't call super
, either. I was able to mix up the script to demonstrate that bug:
# frozen_string_literal: true
require "bundler/inline"
gemfile(true) do
gem "graphql", "2.3.10" # path: "./"
gem 'opentelemetry-api'
gem 'opentelemetry-common'
gem 'opentelemetry-instrumentation-graphql'
gem 'opentelemetry-sdk'
gem 'rails'
end
ENV['OTEL_TRACES_EXPORTER'] = 'console'
OpenTelemetry::SDK.configure do |c|
c.use('OpenTelemetry::Instrumentation::GraphQL')
end
module MultipleTracer
class Thing < GraphQL::Schema::Object
field :str, String
def str; "blah"; end
end
class Named < GraphQL::Schema::Union
possible_types Thing
def self.resolve_type(obj, ctx)
Thing
end
end
class Query < GraphQL::Schema::Object
include GraphQL::Types::Relay::HasNodeField
field :int, Integer, null: false
def int
1
end
field :thing, Thing
def thing; :thing; end
field :named, Named, resolver_method: :thing
end
class TestSchema < GraphQL::Schema
query(Query)
trace_with(GraphQL::Tracing::ActiveSupportNotificationsTrace)
end
end
ActiveSupport::Notifications.monotonic_subscribe("execute_query.graphql") do |name, start, finish, _message_id, data|
puts "#{name} handled"
puts "#{data.inspect}"
# raise "Horray, it worked"
end
query = "{ int thing { str } named { ... on Thing { str } } }"
MultipleTracer::TestSchema.execute(query)
I'll work up a patch for it.
If you get a chance to try master again, please let me know what you find ๐ Thanks again for your help tracking this down.
That did it, thank you! Really appreciate the back and forth and patience to get this one solved :)