Reference type to represent relationship between traces
As discussed here, we may need a new relationship type similar to Follows From, but that results in the Span initiating a new trace.
(cc @jmacd who cares about this... also @yurishkuro who brought this up in the opentracing.io PR cited above)
@objectiser thank you for bringing this up. I like the idea of creating a new reference type, though I think I'm going to argue that the semantics shouldn't require a new trace to be created. The allocation of something like a trace_id is intentionally out of scope for the OT spec.
For batching specifically I think we would do well to create a new reference type. (BATCH_ELEMENT? BATCHED_FROM? BATCH_INCLUDE?)
For general-purpose "large-scale" queueing, it's less clear to me. As a thought exercise, why not keep the reference type as FOLLOWS_FROM and let Tracers use a tag (or similar) to determine whether a new trace_id (or whatever) should be allocated?
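To make the thought exercise concrete, here is a minimal sketch of a tracer that keeps the reference type as FOLLOWS_FROM and lets a tag decide whether a new trace_id is allocated. Everything in it is an assumption for illustration: `ToyTracer`, its `start_span` signature, and the `new_trace` tag name are hypothetical and not part of any OpenTracing API.

```python
import itertools
from dataclasses import dataclass

FOLLOWS_FROM = "follows_from"
NEW_TRACE_TAG = "new_trace"  # hypothetical tag name, not defined by the OT spec

_ids = itertools.count(1)  # toy id generator shared by traces and spans


@dataclass
class SpanContext:
    trace_id: int
    span_id: int


@dataclass
class Span:
    context: SpanContext
    references: list
    tags: dict


class ToyTracer:
    """Hypothetical tracer: the reference type stays the same, the tag picks the trace."""

    def start_span(self, references=(), tags=None):
        references = list(references)
        tags = dict(tags or {})
        if references and not tags.get(NEW_TRACE_TAG):
            # Continue the trace of the first referenced context.
            trace_id = references[0][1].trace_id
        else:
            # No references, or the tag asked for a fresh trace.
            trace_id = next(_ids)
        return Span(SpanContext(trace_id, next(_ids)), references, tags)


tracer = ToyTracer()
producer = tracer.start_span()
inline = tracer.start_span(references=[(FOLLOWS_FROM, producer.context)])
queued = tracer.start_span(
    references=[(FOLLOWS_FROM, producer.context)],
    tags={NEW_TRACE_TAG: True},
)
```

In this sketch `queued` lands in a new trace but still carries its FOLLOWS_FROM reference to `producer`, so the causal link survives the trace boundary.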
(Oh, and cc @JonathanMace who has thought a lot about this sort of thing)
@bhs I think we need to consider batching as a separate issue. But I actually like the idea of just using Follows From, as essentially the relationship type is the same, regardless of whether internally a new trace is started.
So maybe the issue really is, whether the application needs to provide guidance to the tracer regarding whether potentially a new 'trace' should be started (e.g. via a tag), or is this really just an internal decision within the Tracer implementation? For example, in Java if using a framework integration, then how would that guidance be provided anyway?
I think we need to consider batching as a separate issue
(Agreed)
So maybe the issue really is, whether the application needs to provide guidance to the tracer regarding whether potentially a new 'trace' should be started (e.g. via a tag), or is this really just an internal decision within the Tracer implementation? For example, in Java if using a framework integration, then how would that guidance be provided anyway?
In my mind the question hinges on whether there are multiple parents. If it's just a linear sequence, I am personally comfortable with an unadorned FOLLOWS_FROM, though it would be interesting to hear from others about this. For fork/join behavior (or batching), not so much.
I guess we could have a tag to indicate that the parent-child relationship is not latency-sensitive (i.e., that we might be in a throughput-optimized (read: slow) situation). That could hint to the tracing system to allocate a new id. YMMV?
In my mind the question hinges on whether there are multiple parents. If it's just a linear sequence, I am personally comfortable with an unadorned FOLLOWS_FROM, though it would be interesting to hear from others about this. For fork/join behavior (or batching), not so much.
I think this may depend upon whether the multiple Follows From references are related to the same 'trace id' - if so, then I think it would be acceptable for the join to continue with the same trace id - otherwise a new trace could be created to represent the join of multiple traces - but this is probably more of a tracer impl issue.
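As a rough sketch of that join policy (one possible tracer-internal choice, not something the spec prescribes), a tracer could pick the trace id for a join span like this. The function name `trace_id_for_join` and the use of random hex ids are assumptions made for the example.

```python
import uuid


def trace_id_for_join(parent_trace_ids):
    """Pick a trace id for a span that has several Follows From references.

    If every referenced span belongs to the same trace, continue that trace;
    otherwise allocate a new trace to represent the join of multiple traces.
    """
    distinct = set(parent_trace_ids)
    if len(distinct) == 1:
        return next(iter(distinct))
    # Parents come from different traces (or there are none): start a new trace.
    return uuid.uuid4().hex


same = trace_id_for_join(["trace-a", "trace-a"])
mixed = trace_id_for_join(["trace-a", "trace-b"])
```

Here `same` stays in `trace-a`, while `mixed` gets a freshly generated id and the references remain the only link back to the two parent traces.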
I guess we could have a tag to indicate that the parent-child relationship is not latency-sensitive (i.e., that we might be in a throughput-optimized (read: slow) situation). That could hint to the tracing system to allocate a new id. YMMV?
That sounds good to me - and would just be a configuration option to any framework integration for the message consumer.
In my discussions with our users we came across several scenarios where people want to use span reference to capture relationship to another trace. One extreme example: an Uber driver comes online and starts a session in the mobile app that can last 8 hours. The mobile team is interested in seeing that whole session as a single trace, and seeing all backend requests also traced but with different trace IDs, linked back to the "session trace".
The point is, developers sometimes want explicit control over starting a new trace vs. continuing the existing trace while preserving causality between spans. The current childOf / followsFrom reference types do not provide such control. Considering that these ref types express a different dimension, having a standard tag to force a new trace might be good.
Ok so if we have a new tag, e.g. 'subtrace' = 'true' (better name required), then it can also be independent of the relationship types used - so could potentially also be used with a ChildOf relationship.
@yurishkuro IMO the situation you described isn't something to be modeled with span references... I would imagine that each mobile span would be tagged with a driver_id or session_id or similar, and that the tracing system would be able to filter/group by such tags to display a "meta-trace". Thoughts?
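For illustration, the filter/group step could look roughly like the sketch below. The span records are made-up sample data and `group_traces_by_tag` is a hypothetical helper; a real tracing backend would run this as a query over its own storage.

```python
from collections import defaultdict

# Made-up sample spans; only the fields needed for grouping are shown.
spans = [
    {"trace_id": "t1", "operation": "go_online", "tags": {"session_id": "s-42"}},
    {"trace_id": "t2", "operation": "accept_trip", "tags": {"session_id": "s-42"}},
    {"trace_id": "t2", "operation": "fetch_route", "tags": {"session_id": "s-42"}},
    {"trace_id": "t3", "operation": "go_online", "tags": {"session_id": "s-99"}},
    {"trace_id": "t4", "operation": "healthcheck", "tags": {}},
]


def group_traces_by_tag(spans, tag):
    """Return {tag value: sorted trace ids} for every span that carries the tag."""
    groups = defaultdict(set)
    for span in spans:
        value = span["tags"].get(tag)
        if value is not None:
            groups[value].add(span["trace_id"])
    return {value: sorted(ids) for value, ids in groups.items()}


meta_traces = group_traces_by_tag(spans, "session_id")
```

Each key of `meta_traces` is one session, and its value lists the separate traces that would be displayed together as the "meta-trace".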
@bhs tagging only works if you just want to find related traces, but our mobile team also wants to see a "trace" of user's interaction with the app.
@yurishkuro I guess I don't see it that way... rather than having each trace refer to the previous one, it seems more natural (at least to my brain) to have each trace take a "session tag" or something similar.
If we really want to represent this sort of thing with references, I guess I would argue for some sort of giant (i.e., long-duration) Span that models the entire session; then the smaller traces could have a special reference type to that "session Span". Maybe CONTINUES_SESSION? YMMV. I still think the tag approach is more flexible.
@bhs I am not saying "refer to previous", but refer to "mega-trace" if you will. Labelling all individual traces with a session id tag is doable, but it's akin to trying to use general logging for tracing purposes - you lose the context. The mega-trace provides the context, and it's not just a single span, the intention is that it represents user's high level interactions with the app.
@yurishkuro I see. So I think the key differentiator here is that you are talking about an actual Span for the mega-trace in that there's a specific start time and a specific end time. My tagging approach might make sense for something like a specific user_id, but that's actually different as it is "infinite" in the time dimension.
How would you want to describe (in terms of naming) this relationship?
Ironically, I think most of my use cases can still be solved with childOf and followsFrom, provided there was an extra attribute somewhere that says "start a new trace (id)". It could be an optional flag on the SpanReference.
An optional tag to indicate the programmer's desire to start a trace makes some sense, inasmuch as it doesn't muddy OT semantics. I also agree that more relationship types are needed to express incidental contact between spans whether they are part of the same trace or not, such as "RESOURCE_CONTENTION" to indicate (bidirectionally) that two spans made contact.
@yurishkuro Just wondering if that would be necessary, given the new asRoot method being proposed in opentracing/opentracing-java#115? Doesn't this imply that, regardless of what references are subsequently added, the span should be created in a new trace?
@objectiser that sounds good, especially because the "new trace" flag needs to apply to the new span only once, not for every reference that's added to it, so my proposal above is incorrect in that detail.
So the new proposal (note the stronger wording):
add an optional Boolean attribute root to "start span" methods that forces the new span into a new trace. In cases when the API employs an "active span manager" paradigm, the new attribute also instructs the tracer not to establish a child-of relationship with the currently active span. Only relationships provided via explicitly passed References should be established.
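A minimal sketch of how such a flag might behave, assuming a toy tracer with a single `active_span` slot standing in for an "active span manager". `ToyTracer`, `start_span`, and the `root` keyword are hypothetical names chosen for this example and are not an existing OpenTracing API.

```python
import itertools
from dataclasses import dataclass, field

CHILD_OF = "child_of"
FOLLOWS_FROM = "follows_from"

_ids = itertools.count(1)  # toy id generator shared by traces and spans


@dataclass
class Span:
    trace_id: int
    span_id: int
    references: list = field(default_factory=list)


class ToyTracer:
    """Hypothetical tracer with an 'active span' slot and a root flag."""

    def __init__(self):
        self.active_span = None

    def start_span(self, references=(), root=False):
        refs = list(references)
        if not root and self.active_span is not None:
            # Default behavior: implicit child-of link to the active span.
            refs.insert(0, (CHILD_OF, self.active_span))
        if root or not refs:
            # root=True forces a new trace; explicit references are kept.
            trace_id = next(_ids)
        else:
            trace_id = refs[0][1].trace_id
        return Span(trace_id, next(_ids), refs)


tracer = ToyTracer()
session = tracer.start_span()
tracer.active_span = session

child = tracer.start_span()  # implicit child-of, same trace
request = tracer.start_span(references=[(FOLLOWS_FROM, session)], root=True)
```

With `root=True` the new span gets its own trace id, ignores the active span, and keeps only the explicitly passed reference, which therefore becomes an inter-trace link.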
@yurishkuro Not sure any new boolean attribute is required, as if the asRoot method is used the Span will automatically be the root of a new trace. Any subsequent references added will be inter-trace. Javadoc for asRoot:
Remove any explicit (e.g., via {@link SpanBuilder#addReference(String,SpanContext)}) or implicit (e.g., via {@link ActiveSpanSource#activeContext()}) references to parent / predecessor SpanContexts, thus making the built Span a "root" of a Trace tree/graph.
asRoot() is Java-specific, I was referring to a conceptual root: true flag across all APIs.
Ah ok - I thought asRoot may become part of the language independent API. @bhs thoughts?
Java is the only language that uses the builder pattern. An optional Boolean flag to startSpan() is a language-agnostic proposal.
Good point.