Build a Span with a given SpanContext
felixbarny opened this issue · 37 comments
Hi folks,
in Java terms I'd like to be be able to do the following:
SpanContext context = getSpanContext(request);
tracer.buildSpan("operationName")
.withContext(context)
.start();
-> Introduce SpanBuilder#withContext(SpanContext)
Right now, its only possible to create a child span given a SpanContext
. I want to be able to create a span with a specific SpanContext
.
The use case I have in mind is when a Span
is created via JavaScript and then reported to a Java backend. In this backend, I want to recreate the span created in JavaScript, possibly enhance it with some tags (like adding the geo location based on the client ip, parsing of the user agent) and then report it. This is currently not possible in an implementation independent way.
Another use case might be ETL when you want to migrate traces from a legacy system to a new backend. In this case you probably also want to read traces from the old backend, transform them to a canonical format (Span
) and then convert them to the target format.
@felixbarny I can't seem to find that issue, but we discussed in the past a different span reference type like "decorates", i.e. if you have a span context for span A and you create span B with Reference(ctx=spanCtxForSpanA, refType="decorates")
, then the backend may merge the tags/logs from B into A on ingestion or display (or use some sort of different visualization). In this approach we don't need to change the builder interface.
then the backend may merge the tags/logs from B into A on ingestion or display (or use some sort of different visualization)
Adding a reference does not seem to be quite the the semantics I'm aiming for. How is refType = decorates implemented in Jaeger Java?
If there are magic values for reference types, we should probably document them in https://github.com/opentracing/specification/blob/master/specification.md#references-between-spans.
Adding a reference does not seem to be quite the the semantics I'm aiming for.
Well, what seemed odd in your approach is that you're saying you have a Javascript layer that is somehow able to generate an appropriate SpanContext on the wire, which ideally should only be done with an OpenTracing tracer implementation. If you already have such implementation in Javascript, then it should be able to generate the Span in the respective backend format instead of relying on an extra Java server to do the same. The Java server may still want to enrich the span with additional tags, which is what the decorate
reference allows.
How is refType = decorates implemented in Jaeger Java?
It's not yet, as I said it was just one of the suggestions on some github issue, which, sadly, I cannot find, perhaps it wasn't in the specification repo. @tedsuo do you remember?
If there are magic values for reference types, we should probably document
Definitely, once there's an agreement.
generate an appropriate SpanContext on the wire, which ideally should only be done with an OpenTracing tracer implementation
The way I see it every tracing system with uses the same approach for context propagation can create a span context.
In this instance, I'm using Instana's open source end user monitoring library Weasel.
If you already have such implementation in Javascript, then it should be able to generate the Span in the respective backend format instead of relying on an extra Java server to do the same.
Weasel does not create spans in the specific span format I need. In fact, that's also not desirable as you want a much more compressed format when sending beacons back to the server. A weasel beacon might look like this:
http://localhost:9966/petclinic/stagemonitor/public/eum?r=1500460599181&ts=2109&d=2115&ty=xhr&pl=848fad1487a95476&l=http%3A%2F%2Flocalhost%3A9966%2Fpetclinic%2F&m=GET&u=owners.html%3FlastName%3D&a=1&st=200&t=982dc17f92d3ee7e&s=982dc17f92d3ee7e
Where the parameters t
and s
stand for Trace-Id and Span-Id respectively.
Weasel sends this beacon after the ajax request has completed. It also modifies the XMLHttpRequest so that the B3 Headers are propagated.
The Java server may still want to enrich the span with additional tags, which is what the decorate reference allows.
The Java server does not decorate an existing span but rather creates a span from a beacon with a given span context (t
and s
parameters).
I don't think the general principle is specific to my stagemonitor-weasel beacon-span-creating use case. As I mentioned it's just one example of a much broader use case called ETL.
@felixbarny note that there is a significant difference in creating a new span that decorates another span, and re-creating another instance of the span with the existing span ID. The latter may not even be supported by some tracing systems as it means multiple spans with the same ID.
FWIW I also have a use case for the 'decoration' of existing spans. I'm trying to add log fields from a Log4j-based logging system, but the log appender runs asynchronously meaning that by the time the log entry is processed, the span which it corresponds to has already been sent out. Being able to add additional log fields/tags for a known span ID would solve this problem.
@felixbarny note that there is a significant difference in creating a new span that decorates another span, and re-creating another instance of the span with the existing span ID.
I agree.
The latter may not even be supported by some tracing systems as it means multiple spans with the same ID.
This issue is about adding support for that on the API level. Or do you mean it is technically impossible for some tracer implementations? If so, why?
the log appender runs asynchronously meaning that by the time the log entry is processed, the span which it corresponds to has already been sent out
You are trying to implement a log appender which adds logs to the active span? Interesting! Have you tried using a non-async logger for that? While a very interesting one, I don't think this is the use case I was trying to describe in this issue.
You are trying to implement a log appender which adds logs to the active span? Interesting! Have you tried using a non-async logger for that?
Yep - 'active' as in active at the time the log method was called by the user. Really I'm just trying to replicate the zap functionality but with Log4j. A synchronous log appender would work but is not really an option for performance reasons.
@felixbarny Yeah it's a good point - I wasn't sure how much log processing would be in the critical path e.g. variable processing, but I think it looks fine, so I'll go ahead with a synchronous appender. Thanks for the help!
This issue is about adding support for that on the API level. Or do you mean it is technically impossible for some tracer implementations? If so, why?
@felixbarny I cannot answer the question about all implementations, that's why adding something like that to the API should be discussed more broadly.
I think the "span enrichment" use case that @mikewrighton has is easier to solve, with a new span reference type. The ETL scenario is not a very clean use case in the first place, because the assumption is that you want to transform something that is not produced by an OpenTracing implementation into a Span while remaining vendor-agnostic in the transformation. But how can you be vendor neutral if you already depend on the specific format of the SpanContext? You are making the assumption that your JS source generates the SpanContext in the exact same way as the backend tracer, which implies tight vendor coupling. Why not then just create the Span manually using vendor-specific API, like new com.uber.jaerger.Span(...)
?
I am not trying to dismiss your use case, I think it is indeed an important one to support, but I can see it getting easier to support once we are able to standardize on things like SpanContext wire representation (e.g. https://github.com/TraceContext/tracecontext-spec), and possibly vendor-neutral Span representation (e.g. #64).
that's why adding something like that to the API should be discussed more broadly.
Before talking about if its possible to implement we should agree if it makes sense to add the method. What is you opinion on that?
I cannot answer the question about all implementations
What about Jaeger, would it be possible to add it there?
But how can you be vendor neutral if you already depend on the specific format of the SpanContext? You are making the assumption that your JS source generates the SpanContext in the exact same way as the backend tracer, which implies tight vendor coupling.
The only assumption I make is that the source (JavaScript) produces a B3 compatible SpanContext. I don't see a vendor coupling here.
Why not then just create the Span manually using vendor-specific API, like new com.uber.jaerger.Span(...)?
Because I want to be able to use different OT Impls.
The only assumption I make is that the source (JavaScript) produces a B3 compatible SpanContext.
But in order to do .withContext(context)
you still need to create a vendor-specific SpanContext first.
Before talking about if its possible to implement we should agree if it makes sense to add the method. What is you opinion on that?
In theory it makes sense to me, but I think the standard for wire format is a pre-requisite, otherwise I still don't see how you can make your code completely vendor neutral.
What about Jaeger, would it be possible to add it there?
The simple ETL case wouldn't break Jaeger, because it still means each span will have a unique span ID. The enrichment case other people asked for, such as adding tags/logs to the existing span via another instance with the same span ID, will probably cause an issue. Jaeger backend does allow multiple spans with the same ID, but that was only done to support Zipkin's shared RPC span model and those client-server spans are actually deduped before returning to the UI. Additionally, we had an internal process where we parse haproxy logs and store them by writing another span with the same ID. To deal with that the resulting dups we have a pre-processor in the query service that finds those haproxy spans and merges them into the real span by essentially discarding everything except logs. A general solution for multiple spans with the same ID requires certain merging rules, like in Zipkin where it does min(startTime), max(duration), union(logs), and then somewhat funky logic about deciding the operation name (certain precedence order). By going with a decorator reference type we avoid the need for the merging.
But in order to do .withContext(context) you still need to create a vendor-specific SpanContext first.
Yes, but it is independent form the OT Impl: I can create the context with Tracer#extract
and a custom Format
(aka extractor) which reads the weasel specific URL parameters from the HTTP request.
Sure enough, I have to write an extractor which is specific to Weasel but it is agnostic of the underlying OT-Impl.
Sure enough, I have to write an extractor which is specific to Weasel but it is agnostic of the underlying OT-Impl.
Yes, assuming that you only deal with OT impls that understand a common wire format like B3
@bhs could you have a look at this?
I think we agreed that this is a reasonable change but need feedback from other trace implementors.
@felixbarny so sorry for the delay – crazy times in my inbox!!
TL;DR, if we do this, I would prefer something like a new "SELF" Reference
type... something like
public final class References {
...
/**
* The {@link SELF} reference class can be used to construct a {@link Span} instance with a specific {@link SpanContext}.
*/
public static final String SELF = "self";
}
Adding a new reference type feels less significant than adding a new method to an API (and it's also not a breaking change FWIW, though that's not my primary motivation for suggesting this approach).
Whether all Tracers will Do The Right Thing with this paradigm is unlikely. I think I see the value in making it possible to express these semantics, though. Hope this helps?
That's a good suggestion! Should I make a pull request for that?
@felixbarny I am personally comfortable with it, though @opentracing/otsc may have other opinions.
Sorry again for the delay, I am barely at a keyboard during the week these days.
I am fine with SELF, I actually thought I suggested it earlier myself, but may have been just thinking about it. It's definitely preferable to a change in the API. And I would prefer it to be introduced as an incubating feature (similar to the service
tag proposal).
I would prefer it to be introduced as an incubating feature
👍
That would make this a no-brainer for me. I see upside but want to be cautious.
The specification repo is plain text, we don't need any markets. In the APIs the constant comments should be sufficient.
@tedsuo had some thoughts about adding a column for maturity... Incubating | Stable | Deprecated
IIRC
The PR opentracing/opentracing-java#212 revealed an issue with the SELF reference - one cannot create a proper parent reference unless the parent ID is propagated on the wire, which many systems do not do, and it's not considered as part of https://github.com/TraceContext/tracecontext-spec
By the way, I just thought I invented a request for a self-reference, then @felixbarny pointed me to this which was asked for a long time ago.
If you allow a self-reference, people can report spans post-factum. The implementation detail of creating a span context can be vendor specific as necessary, but lacking this feature there's no way to report post factum unless you are doing it in event order.
For example @mojsha requested that zipkin instrumentation be able to report to an opentracing tracer. I can't even do this with jaeger-specific code without package issue for lack of a self-reference feature.
This artificially limits the ways people can integrate data, as if there was some way even jaeger-specific, you could unlock a huge amount of projects to being able to participate at least from a recording perspective without pinning their core deps.
public class ZipkinReporterAdapter implements zipkin2.reporter.Reporter<zipkin2.Span> {
com.uber.jaeger.Tracer tracer;
@Override public void report(zipkin2.Span zipkinSpan) {
// sneaky access needed because there is no way to instantiate a jaeger span
// given only a trace context.
//
// For example, no Tracer.toSpan(context) or self-reference in either OpenTracing or Jaeger api
com.uber.jaeger.Span uberSpan = Access.newSpan(
tracer,
zipkinSpan.name(),
context(zipkinSpan),
zipkinSpan.timestamp(),
new LinkedHashMap(zipkinSpan.tags())
);
// TODO make Opentracing local tags out of zipkinSpan.localEndpoint()
// TODO make Opentracing peer.tags out of zipkinSpan.remoteEndpoint()
// TODO make Opentracing tag out of zipkinSpan.kind()
// TODO guard zero etc.
uberSpan.finish(zipkinSpan.timestampAsLong() + zipkinSpan.durationAsLong());
}
static com.uber.jaeger.SpanContext context(zipkin2.Span zipkinSpan) {
com.uber.jaeger.SpanContext result = new com.uber.jaeger.SpanContext(
fromHex(zipkinSpan.traceId()) // guard as zipkin trace ID can be 128bit
fromHex(zipkinSpan.parentId()) // guard null
...
)
}
}
by the way, in brave this is Tracer.toSpan(context)
which allows you to add anything. the self-reference thing could backport similar to span builder. The above code could work. If the problem is about who hold the parent id, as long as we can materialize that we're also good. Another option is to define a struct, reporter api or any other way
Does Tracer.toSpan(ctx) actually start the span, ie capture the start timestamp?
seems like we got stuck on a problem that wasn't that hard to solve opentracing/opentracing-java#212 (comment)
with that I'm +1 to introduce SELF reference (jaegertracing/jaeger#744)
@yurishkuro no toSpan is not lifecycle effecting. It just establishes the trace IDs to use in further recordings.
Following up here as well (Sorry for the delay). Seems like there is sustained interest in this, can someone champion it by creating an RFC and submitting a PR?
RFC Template: https://github.com/opentracing/specification/blob/master/rfc_template.md
RFC Process: https://github.com/opentracing/specification/blob/master/rfc_process.md
Thanks!
@tedsuo considering this issue was delayed so long, predating the existence of RFC process. How about if someone in OT who is routinely working on other things champion it on behalf of others?
@mabn maybe you have interest in this? Would you be up to an RFC or is using Jaeger types good enough for now?
The use case I have in mind is when a Span is created via JavaScript and then reported to a Java backend. And I want to create a child span under the given span context in java? Can you please help me to solve this use case.
I also have a requirement - an adapter which collects some start and end events from external systems. Such events have timestamps and normally start or stop some activity - you can think of them as start and stop span legs - if you combine two corresponding span legs, a span can be formed. The correlation is quite simple even for building an entire DAG of span relationship.
The problem is that, it is not possible to incrementally complete the spans (when both span legs arrived, report and forget them) because parent SpanContext
may not be available yet (as its span legs haven't been processed yet):
- We know some application-specific correlation id for future parent
SpanContext
from child legs and could possibly pre-create it (store under that correlation id) so that childSpanContext
can now become complete for trace reporting (and cleanup of memory). We would complete the parentSpan
by re-creating it from that parentSpanContext
later. - But if we pre-create parent
SpanContext
upfront (before its own span legs arrive), we won't be able to create itsSpan
object later from it - instead, everySpan
is created with newSpanContext
(with new generated span id) which breaks the trace.