Incoming requests with "not sampled" `traceparent` headers are not reported to AppSignal
unflxw opened this issue · 1 comments
Intercom conversation: https://app.intercom.com/a/inbox/yzor8gyw/inbox/admin/5246522/conversation/16410700259808
Original (wrong) explanation and thoughts
When a request from a third party carries over OpenTelemetry `traceparent` and `tracestate` headers, OpenTelemetry will automatically read those headers, using them to set the parent of the trace that will be generated to a span (on the third party's end) that the AppSignal agent will never receive.
The AppSignal agent, in turn, will never flush a trace whose root span is not known to it (unless that trace contains a span that is marked as a root span candidate).
A possible solution for this issue, then, would be to mark spans at the root of an HTTP request being handled (WSGI, Flask, and some (?) of the spans generated by ASGI, FastAPI, Starlette) as root span candidates.
A more general solution would be, given that OpenTelemetry enforces that what is sent is always the "whole sub-trace" as far as the OpenTelemetry-instrumented application is concerned, to treat the "root-most" span received in a trace exporter request as the root span -- even if it references an unknown parent. (Note that this is not compatible with the way that AppSignal for Node.js currently flushes OpenTelemetry spans to the extension "one at a time" -- but this could be addressed on that end.)
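To illustrate the "root-most" idea, here is a minimal sketch in Python. The `Span` type and the `find_root_most` function are hypothetical names made up for this example; they are not the agent's actual data model or API. The sketch assumes a whole batch of spans arrives in one exporter request, and treats any span whose parent is absent from the batch as a root candidate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    # Hypothetical, simplified span shape for illustration only.
    span_id: str
    parent_span_id: Optional[str]
    name: str


def find_root_most(spans: List[Span]) -> List[Span]:
    """Return spans whose parent is missing from this batch.

    A span with no parent at all, or with a parent ID that does not
    belong to any span in the batch (for example, a span on the third
    party's end), is considered "root-most".
    """
    known_ids = {span.span_id for span in spans}
    return [
        span
        for span in spans
        if span.parent_span_id is None or span.parent_span_id not in known_ids
    ]
```

With a batch where the HTTP request span points at a remote parent, only that request span is returned, since its children all reference parents inside the batch. This only works if the batch really contains the whole sub-trace, which is why sending spans "one at a time" would not be compatible with it.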
When a request from a third party carries over OpenTelemetry `traceparent` and `tracestate` headers, OpenTelemetry will automatically read those headers, using them to set the parent of the trace that will be generated to a span that this OpenTelemetry-instrumented application will never receive.
If the `traceparent` ends in `-00`, meaning "not sampled", this decision to not sample the request from the third party's OpenTelemetry stack will carry over to our stack, and OpenTelemetry will not sample this request.
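For reference, a small sketch of how the sampled flag can be read from a `traceparent` value. It assumes the W3C Trace Context layout of `version-trace_id-parent_id-flags`, where the lowest bit of the two-digit hexadecimal flags field is the "sampled" bit. The `is_sampled` helper is a name invented for this example, not part of OpenTelemetry.

```python
def is_sampled(traceparent: str) -> bool:
    """Return True if the traceparent header has the sampled bit set.

    Assumes the W3C layout: version-trace_id-parent_id-flags.
    """
    parts = traceparent.strip().split("-")
    if len(parts) < 4:
        raise ValueError(f"unexpected traceparent format: {traceparent!r}")
    # The flags field is two hex digits; bit 0 is the "sampled" flag.
    flags = int(parts[3], 16)
    return bool(flags & 0x01)
```

A header ending in `-00` has no bits set, so `is_sampled` returns `False`; one ending in `-01` returns `True`.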
Folks on the CNCF Slack have helpfully suggested a workaround in the form of a WSGI middleware, as well as using the `ParentBased` sampler decorator to force OpenTelemetry to sample traces with the "not sampled" flag. We may want to make the latter part of our default configuration, both here and in the Node.js instrumentation.
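As a rough idea of what the WSGI middleware workaround could look like (this is my own sketch, not necessarily the one suggested on Slack), the middleware below sets the sampled bit on any incoming `traceparent` header before the request reaches the application. `force_sampled` and `ForceSampledMiddleware` are hypothetical names. It assumes the middleware wraps the application outside of the OpenTelemetry WSGI instrumentation, so that OpenTelemetry reads the rewritten header.

```python
def force_sampled(traceparent: str) -> str:
    """Return the traceparent value with the sampled bit set.

    Values that do not look like version-trace_id-parent_id-flags
    are returned unchanged.
    """
    parts = traceparent.strip().split("-")
    if len(parts) != 4:
        return traceparent
    version, trace_id, parent_id, flags = parts
    try:
        flag_bits = int(flags, 16)
    except ValueError:
        return traceparent
    # Set bit 0 ("sampled") while preserving any other flag bits.
    return "-".join([version, trace_id, parent_id, format(flag_bits | 0x01, "02x")])


class ForceSampledMiddleware:
    """WSGI middleware that marks incoming trace context as sampled."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the traceparent request header under this key.
        header = environ.get("HTTP_TRACEPARENT")
        if header is not None:
            environ["HTTP_TRACEPARENT"] = force_sampled(header)
        return self.app(environ, start_response)
```

The sampler route avoids touching headers at all: in the Python SDK, `ParentBased` from `opentelemetry.sdk.trace.sampling` accepts a `remote_parent_not_sampled` argument, so passing `ALWAYS_ON` there (alongside `root=ALWAYS_ON`) should sample requests even when the remote parent was not sampled. That is probably the cleaner option for a default configuration.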