Failed to report profiling data, connection closed before message completed

Question

Opened this issue 5 months ago · 3 comments

Current behaviour
Intermittent errors sending traces, unsure how to debug further

E, [2024-06-13T08:29:54.066769 #28] ERROR -- ddtrace: [ddtrace] (/usr/local/bundle/gems/ddtrace-1.22.0/lib/datadog/profiling/http_transport.rb:62:in `export') Failed to report profiling data ({:agent=>"unix:///var/run/datadog/apm.socket"}): failed ddog_prof_Exporter_send: connection closed before message completed

Expected behaviour
no error sending trace

Steps to reproduce
Unsure, happens randomly across cluster and different pods

How does datadog help you?

Environment

datadog version: 1.7.45
Configuration block (Datadog.configure ...):

Datadog.configure do |c|
  c.tracing.instrument(:sneakers, service_name: 'workers')
  c.tracing.instrument(:active_record, describes: :primary, service_name: 'postgres')

  c.tracing.instrument(:delayed_job, service_name: 'workers')
  c.tracing.instrument(:active_job, service_name: 'workers')
end

Ruby version: 3.2.4
Operating system: Kube pod for docker image: ruby:3.2.4-slim-bullseye
Relevant library versions:

ddtrace 1.22.0 (also tried 1.23.1)
Using volume mounted socket, most traces are getting through
ddagent image: gcr.io/datadoghq/agent:7.48.1
env vars:
DD_STATSD_SOCKET_PATH: /var/run/datadog/dsd.socket
DD_TRACE_AGENT_URL: unix:///var/run/datadog/apm.socket
DD_VERSION: 1.7.45

Answer 1 · 2024-06-13T11:23:21.000Z

Hey @ganey, thanks for getting in touch.

The error message you shared is actually related to profiling, not sending of traces...

E, [2024-06-13T08:29:54.066769 #28] ERROR -- ddtrace: [ddtrace] (/usr/local/bundle/gems/ddtrace-1.22.0/lib/datadog/profiling/http_transport.rb:62:in `export') Failed to report profiling data ({:agent=>"unix:///var/run/datadog/apm.socket"}): failed ddog_prof_Exporter_send: connection closed before message completed

...so it's possible your traces are not getting affected at all.

One thing that's a bit confusing is that your configuration block and environment variables do not mention enabling profiling at all. Can you double-check whether there are any missing environment variables and/or configuration that enable profiling?
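One way to do that check from inside a pod is sketched below. It only looks at environment variables, so it will not catch profiling being turned on in code (for example via `c.profiling.enabled = true` in another `Datadog.configure` block). The helper names are hypothetical, and treating `true` and `1` as "enabled" is an assumption about how the library parses `DD_PROFILING_ENABLED`.

```ruby
# Hypothetical helpers for spotting profiling-related settings in the environment.
# They take a Hash-like object so they can be run against ENV or a copy of it.

# Returns every DD_* variable whose name mentions PROFILING.
def profiling_env_vars(env = ENV)
  env.select { |key, _value| key.start_with?('DD_') && key.include?('PROFILING') }
end

# Returns true when DD_PROFILING_ENABLED looks like it is switched on.
# Assumption: 'true' and '1' (case-insensitive, surrounding whitespace ignored)
# are the values that enable profiling.
def profiling_enabled_via_env?(env = ENV)
  %w[true 1].include?(env.fetch('DD_PROFILING_ENABLED', '').strip.downcase)
end
```

Running `p profiling_env_vars` and `p profiling_enabled_via_env?` in a console inside the affected pod would show whether the environment is what enables profiling. If both come back empty or false, the setting is most likely coming from Ruby configuration instead.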

Answer 2 · 2024-06-17T10:16:14.000Z

Hey @ivoanjo thanks for the quick response.

So I dug further into the code, and one of the gems we're using is turning on profiling. Here's the configuration we're using from that gem:

Datadog.configure do |c|
  container_name = ENV.fetch('CONTAINER_NAME') { '' }
  pod_name = ENV.fetch('POD_NAME') { '' }
  global_tags = [
    "railsenv:#{Rails.env}",
    "service:workers",
    "container_name:#{container_name}",
    "pod_name:#{pod_name}",
  ]
  
  c.runtime_metrics.enabled = true
  
  datadog_singleton = DatadogSingleton.instance
  datadog_statsd_socket_path = ENV.fetch('DD_STATSD_SOCKET_PATH') { '' }
  datadog_singleton.statsd = if datadog_statsd_socket_path.to_s.strip.empty?
    Datadog::Statsd.new(ENV['DD_AGENT_HOST'], 8125, tags: global_tags)
  else
    Datadog::Statsd.new(socket_path: datadog_statsd_socket_path, tags: global_tags)
  end
  c.runtime_metrics.statsd = datadog_singleton.statsd
  
  # Trace tags API is Hash<String,String>, see https://www.rubydoc.info/gems/ddtrace/Datadog/Tracing
  # Should match the global tags, but as a Hash.
  c.tags = {
    railsenv: Rails.env,
    service: 'workers',
    container_name: container_name,
    pod_name: pod_name,
  }
  
  c.tracing.enabled = true
  c.profiling.enabled = true
  
  c.tracing.instrument(:rails, service_name: 'workers')
  
  c.logger.level = Logger::WARN
end

Answer 3 · 2024-06-18T13:49:18.000Z

Ahh, thanks, this is definitely a more thorough configuration.

Can I ask you to open a support ticket so we can look at your account and investigate this?

Could you include in the ticket:

A mention that the problem seems to be triggered by profiling, and a link to this GitHub issue
Are you still receiving traces and profiles for your app? Could you include a link to them so we can look into it?
If possible, can you also include the log lines that the library prints when it starts? You should see one starting with DATADOG CONFIGURATION - CORE and another with DATADOG CONFIGURATION - TRACING. It would be great if you could include both.
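If the startup lines are buried in a large log, a small filter like the sketch below may help pull them out. The helper name and the idea of passing in already-collected log lines are assumptions for illustration. Only the two marker strings come from the message above.

```ruby
# Marker strings for the startup configuration log lines mentioned above.
CONFIG_MARKERS = [
  'DATADOG CONFIGURATION - CORE',
  'DATADOG CONFIGURATION - TRACING'
].freeze

# Hypothetical helper: keeps only the lines that contain one of the markers.
# Accepts any enumerable of strings (an Array, File.foreach(path), $stdin.each_line).
def startup_config_lines(log_lines)
  log_lines.select { |line| CONFIG_MARKERS.any? { |marker| line.include?(marker) } }
end

# Possible usage, assuming the pod's logs were saved to a local file first:
#   puts startup_config_lines(File.foreach('pod.log'))
```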

Thank you!