DataDog APM not receiving data
layerssss opened this issue · 11 comments
Describe the bug
I was tracking an issue where, after upgrading ruby-graphql, the DataDog APM stopped receiving data. I found that this occurs in version 2.1.11, as well as 2.2.24 and 2.3.3 (it works well in <= 2.1.10).
I noticed that when it was working well (2.2.10), I could get these logs with DD_TRACE_DEBUG=true:
Name: execute.graphql
Span ID: 306045165879513178
Parent ID: 1395647012708376529
Trace ID: 135933776842391537409026544268374695538
Type: custom
Service: ruby-graphql
Resource: execute.graphql
Error: 0
Start: 1715725477160225024
End: 1715725492656022016
Duration: 15.495798999909312
Tags: [
env => development,
component => graphql,
operation => execute_query,
selected_operation_name => cameraPreviewMotionCrop,
selected_operation_type => mutation,
query_string => mutation cameraPreviewMotionCrop($input: CameraPreviewMotionCropInput!) {
cameraPreviewMotionCrop(input: $input) {
jpgImageBase64
previewErrorLogs
__typename
}
}]
Metrics: [
],
But in 2.2.11 (when APM is no longer receiving data), I get:
Name: execute.graphql
Span ID: 335850196281974193
Parent ID: 2492163108850977412
Trace ID: 135933765750448785412869899199825844147
Type: custom
Service: rails
Resource: execute.graphql
Error: 0
Start: 1715725337101981952
End: 1715725353642692096
Duration: 16.540711999870837
Tags: [
env => development,
component => graphql,
operation => execute_query,
selected_operation_name => cameraPreviewMotionCrop,
selected_operation_type => mutation,
query_string => mutation cameraPreviewMotionCrop($input: CameraPreviewMotionCropInput!) {
cameraPreviewMotionCrop(input: $input) {
jpgImageBase64
previewErrorLogs
__typename
}
}]
Metrics: [
],
I could see that Service: changed from ruby-graphql to rails, so I wondered if this patch made in 2.1.10 had potentially broken it. @TonyCTHsu
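The comparison above can also be done mechanically. The following is a minimal sketch, assuming the debug output keeps the `Key: value` layout shown in the two dumps; the helper names (`parse_span_fields`, `differing_fields`) and the list of ignored fields are made up for illustration and are not part of ddtrace.

```ruby
# Fields that differ between any two spans and are therefore not interesting.
VOLATILE_FIELDS = ["Span ID", "Parent ID", "Trace ID", "Start", "End", "Duration"].freeze

# Parse the scalar "Key: value" lines of a span dump into a Hash.
# Multi-line blocks such as "Tags: [" and their contents are skipped.
def parse_span_fields(dump)
  dump.each_line.with_object({}) do |line, fields|
    match = line.strip.match(/\A([A-Z][A-Za-z ]*): (.+)\z/)
    next if match.nil? || match[2] == "["
    fields[match[1]] = match[2]
  end
end

# Return { field => [before, after] } for every non-volatile field that changed.
def differing_fields(before, after)
  a = parse_span_fields(before)
  b = parse_span_fields(after)
  changed = (a.keys | b.keys).reject { |key| VOLATILE_FIELDS.include?(key) }
  changed = changed.select { |key| a[key] != b[key] }
  changed.to_h { |key| [key, [a[key], b[key]]] }
end

# Abbreviated copies of the two dumps from this report.
WORKING = <<~DUMP
  Name: execute.graphql
  Span ID: 306045165879513178
  Type: custom
  Service: ruby-graphql
  Resource: execute.graphql
  Error: 0
  Tags: [
  env => development,
DUMP

BROKEN = <<~DUMP
  Name: execute.graphql
  Span ID: 335850196281974193
  Type: custom
  Service: rails
  Resource: execute.graphql
  Error: 0
  Tags: [
  env => development,
DUMP

differing_fields(WORKING, BROKEN).each do |field, (old, new)|
  puts "#{field}: #{old} -> #{new}"
end
# prints "Service: ruby-graphql -> rails"
```

Only the service name differs once the per-span IDs and timestamps are ignored.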
Versions
- graphql version: 2.1.11
- rails (7.1.3.2)
- ddtrace (1.23.0)
Steps to reproduce
- executing a GraphQL request
Expected behavior
Expect data entries to show up in the DataDog ruby-graphql APM.
Actual behavior
No data arrives in the DataDog ruby-graphql APM.
Additional context
The way we configure DataDog:
...Gemfile
gem "ddtrace", require: "ddtrace/auto_instrument"
...config/initializers/datadog.rb
Datadog.configure do |c|
  c.tracing.instrument :active_model_serializers
  c.tracing.instrument :aws
  c.tracing.instrument :excon
  c.tracing.instrument :faraday
  c.tracing.instrument :http
  c.tracing.instrument :httpclient
  c.tracing.instrument :rails, service_name: "rails"
  c.tracing.instrument :redis
  c.tracing.instrument :sidekiq, service_name: "sidekiq", client_service_name: "sidekiq-client"
end
...app/graphql/some_application_schema.rb
class SomeApplicationSchema < GraphQL::Schema
  use(GraphQL::Tracing::DataDogTracing)
  ...
We have a secondary GraphQL schema, also with use(GraphQL::Tracing::DataDogTracing).
Switching to the new trace_with GraphQL::Tracing::DataDogTrace after upgrading didn't solve the issue.
Switching to ddtrace 2.x (datadog) didn't solve the issue either.
Hey @layerssss, thanks for reporting this issue.
I see you spotted the difference in the Service: name. Were you able to find data in DataDog under the new service name? (I'm not sure how that's surfaced inside DataDog, but I'm wondering whether there's really no data going to DataDog, or whether data is still going there, but it's in a new place.)
Judging by the diff in that PR, it looks like you could set the Service back to ruby-graphql by passing it as an option, for example:
trace_with GraphQL::Tracing::DataDogTrace, service: "ruby-graphql"
What happens if you add that option to your setup?
Hi @rmosolgo, I think you are right: the data is sent to DataDog under the new service name rails. But it didn't get processed, I assume because the records won't "fit" into the "rails" APM. Here is a screenshot inside the DataDog rails APM. There are no additional entries related to GraphQL (without specifying the service: option).
Adding service: "ruby-graphql" does work around the issue though. (The ruby-graphql APM did get populated.)
cc @TonyCTHsu @vpellan is this intended behavior? Should the GraphQL-Ruby plugin still be providing ruby-graphql as the default service name?
👋 @layerssss @rmosolgo Thanks for reporting.
service is a field defined to be an entity that groups together endpoints, queries, or jobs for the purposes of building your application.
Generally speaking, it is your application.
Historically, it was abused for other reasons. Assigning it incorrectly would break other features such as Service Catalog.
GraphQL should be considered internal to your application unless you explicitly define it as a service separate from your application. The service for GraphQL spans will be labelled with the service defined in your configuration:
Datadog.configure do |c|
  c.service = "..."
end
I would highly recommend NOT providing the default service ruby-graphql, because eventually this field will be deprecated from Datadog's API.
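As a rough illustration of the fallback described in this thread, here is a small sketch. It is a hypothetical model only, not code from GraphQL-Ruby or ddtrace: the method `graphql_span_service`, its parameters, and the constant are invented for this example. It assumes that an explicit service: option always wins, that older versions fell back to ruby-graphql, and that newer versions fall back to the application-wide service.

```ruby
# Hypothetical model of the behaviour discussed in this thread.
# This is NOT code from GraphQL-Ruby or ddtrace.
LEGACY_GRAPHQL_SERVICE = "ruby-graphql"

# option:              the service passed to the GraphQL tracing plugin, or nil
# application_service: the application-wide service (c.service)
# legacy_default:      true models the older behaviour, which used "ruby-graphql"
def graphql_span_service(option:, application_service:, legacy_default: false)
  return option if option
  legacy_default ? LEGACY_GRAPHQL_SERVICE : application_service
end

# Older behaviour, no option given:
puts graphql_span_service(option: nil, application_service: "rails", legacy_default: true)
# prints "ruby-graphql"

# Newer behaviour, no option given:
puts graphql_span_service(option: nil, application_service: "rails")
# prints "rails"

# Newer behaviour with the explicit option:
puts graphql_span_service(option: "ruby-graphql", application_service: "rails")
# prints "ruby-graphql"
```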
@TonyCTHsu Thanks for referring to the documentation. But if I remove the service: option, I can no longer find any stats for each executed GraphQL query in the APM section in DataDog.
They are not inside the rails service, as my screenshot above shows. Where should I look for them?
@TonyCTHsu I've tried setting a default service option for the whole application.
Now the whole configuration becomes:
Datadog.configure do |c|
  c.service = "another-rails-app"
  c.tracing.contrib.global_default_service_name.enabled = true
  if ENV["DD_ENV"].present?
    c.tracing.instrument :active_model_serializers
    c.tracing.instrument :active_support
    c.tracing.instrument :aws
    c.tracing.instrument :excon
    c.tracing.instrument :faraday
    c.tracing.instrument :http
    c.tracing.instrument :httpclient
    c.tracing.instrument :rails
    c.tracing.instrument :redis
    c.tracing.instrument :sidekiq
    c.profiling.enabled = true if Rails.env.production?
  else
    c.tracing.enabled = false
  end
end
graphql/???_schema.rb has trace_with GraphQL::Tracing::DataDogTrace (we have 2 schemas).
I've also removed require: "ddtrace/auto_instrument" from the Gemfile, just in case.
This does result in all integrations ending up nicely under the new "service" (another-rails-app). But GraphQL seems to be the only one missing here.
When I changed trace_with GraphQL::Tracing::DataDogTrace to trace_with GraphQL::Tracing::DataDogTrace, service: "another-graphql-app", GraphQL stats appeared (under the new service name).
Hey @layerssss, is there any information in the execute.graphql span that you cannot get from the rack.request span, scoped to the Rails controller that handles GraphQL requests in your application?
In execute.graphql (when it works), each resource is the name of a GraphQL query (passed as operation_name from ruby-graphql). This shows me an execution span for each different query, corresponding to the React component it was triggered from.
In rack.request, each resource is the name of a controller, so it shows a span for each controller. All GraphQL requests go through one controller, GraphqlController, so I won't be able to tell, e.g., which React component is triggering a slow GraphQL query.
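To make the difference concrete, here is a minimal sketch of the two groupings. The `Span` struct and `slowest_by` helper are made up for illustration and are not how DataDog aggregates data. The two cameraPreviewMotionCrop durations come from the debug output earlier in this issue; the cameraList entry is a hypothetical second operation added only so the grouping has something to separate.

```ruby
# Illustrative span record; not a ddtrace class.
Span = Struct.new(:controller, :operation_name, :duration, keyword_init: true)

SPANS = [
  Span.new(controller: "GraphqlController", operation_name: "cameraPreviewMotionCrop", duration: 15.495798999909312),
  Span.new(controller: "GraphqlController", operation_name: "cameraPreviewMotionCrop", duration: 16.540711999870837),
  # Hypothetical operation and duration, for illustration only.
  Span.new(controller: "GraphqlController", operation_name: "cameraList", duration: 0.2)
].freeze

# Group spans by the given attribute and keep the slowest duration per group.
def slowest_by(spans, key)
  spans.group_by(&key).transform_values { |group| group.map(&:duration).max }
end

by_controller = slowest_by(SPANS, :controller)
by_operation = slowest_by(SPANS, :operation_name)

puts by_controller.keys.length # prints 1
puts by_operation.keys.length  # prints 2
```

Grouped by controller, everything collapses into a single GraphqlController entry; grouped by operation name, the slow mutation stands out on its own.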
@layerssss, glad to hear that using service: ... makes the data show up again.
@marcotc or @TonyCTHsu, can either of you provide a screenshot of how GraphQL data should look in the DataDog UI? I want to make sure our default plugin makes data appear somewhere, because as @layerssss mentioned, it contains information that other spans don't have.
I don't know what else needs to happen in GraphQL-Ruby here, so I'll close this out.