w3c/trace-context

Clarify tracestate keys MUST be unique

codefromthecrypt opened this issue ยท 31 comments

Currently, uniqueness of tracestate keys is not defined anywhere. The closest it gets is the section on updating a key. However, there's nothing that says a key can't be added twice, at least not that I can find.

Code written poorly could easily merge together 2 tracestate headers and result in duplicate values. Consider messaging where you have a "current trace" and you are reading incoming state also.

This should be addressed with a MUST somewhere, ex MUST not duplicate keys or MUST ignore duplicate keys, reading the first.

I believe we can use later text here as justification to tighten more explicitly:

"Only one entry per key is allowed because the entry represents that last position in the trace. Hence vendors must overwrite their entry upon reentry to their tracing system."

We can also look at implementations which imply only one entry per key "https://github.com/open-telemetry/opentelemetry-java/blob/master/api/trace/src/main/java/io/opentelemetry/api/trace/TraceState.java#L62"

The concern here is to be explicit as merging can happen, especially as this spec was intentionally written to make splitting across multiple headers allowed (ex this is not allowed by default in HTTP, only for headers who are indicated as such *). A bug where something adds the same header twice or a concat bug could get into a situation of dupes.

I'd suggest a "read first key" clarification or decide to be more strict and throw it away if there are dupes.

* HTTP RFC 7230 allows encoding of multiple fields with the same name, but this only applies to fields which define themselves as comma-separated.

another way out is to reduce this to saying if it is considered malformed or not. If not malformed, I'm personally ok with the spec saying that implementations can decide which redundant entry to propagate down. We should at least say if it is considered malformed or not.

We have discussed this issue and similar at the bi-weekly meetings. I think the general sentiment is that because tracestate keys should only be read by one tracing system, it should determine how to handle the case that its own keys are duplicated. If a key is duplicated which your tracing system is unaware of, it is ignored and treated as opaque. This is muddied a bit by implementations which represent tracestate internally as a map which implies a single key. I agree that the wording could be made more explicit if this is behavior we decide is desireable.

@dyladan I think this commentary is a bit related and worth a read

https://github.com/openzipkin-contrib/brave-propagation-w3c#why-tracestate-is-mostly-unusable

I have been helping implement this but have generally been confused on the difference vs baggage. But talking with Adrian made clear to me that there really isn't a difference without the implementation of the vendor processing model, which at least OTel Java does not nor does the spec seem to clarify. Without it, though, it seems handling of duplicates in one way or another, or doing anything other than just passing through a dumb Map, doesn't seem to do much vs using baggage to accomplish the same. If there is indeed a difference, we need some real world examples in the specs to differentiate them I think.

/cc @jkwatson who really implemented tracestate and baggage in OTel and probably would prefer not to be dragged into this :) But another unknown for us with regard to tracestate has been how to deal with metadata. It's not well defined and seems like it could be dropped, as well as tracestate itself at that point given the ambiguity of the processing model.

/cc @jkwatson who really implemented tracestate and baggage in OTel and probably would prefer not to be dragged into this :) But another unknown for us with regard to tracestate has been how to deal with metadata. It's not well defined and seems like it could be dropped, as well as tracestate itself at that point given the ambiguity of the processing model.

I can't actually say that I implemented either of them, but I am familiar with the code, as I've cleaned it up and worked with it quite a bit. ;)

I think that we are touching upon a few different topics here:

The spec should be clear that keys in tracestate should be unique

There are different opinions on that. One group states that a key is always owned by a monitoring system and, as such, if a dupe occurs, it was introduced by them and need to be handled by them.
Another group states, that the specification should rule that out.
My personal opinion is, that no harm is done if we are being more specific here and disallow dupes in tracestate keys

Can we rely on the order of keys in tracestate?

As @adriancole states in the linked statement, OpenTelemetry right now only passes through the tracestate without modifying it.
This is spec-compliant, as the spec defines in https://w3c.github.io/trace-context/#mutating-the-tracestate-field

Vendors receiving a tracestate request header MUST send it to outgoing requests. It MAY mutate the value of this header before passing to outgoing requests. When mutating tracestate, the order of unmodified key/value pairs MUST be preserved. Modified keys MUST be moved to the beginning (left) of the list.

So, the mutation is not mandatory. This also implies that the order of the keys can just provide a hint and using some heuristics, one may be able to try to infer the caller, but it definitely does not provide information that can be trusted.

I can see that this may have been planned differently when adding the header but right now the general sentiment and practice seems to be that no-one trusts neither order nor content of third-party tracestate entries.

How is tracestate then any different from baggage?

tracestate is owned by the participating monitoring systems and can only be manipulated by them.
baggage, on the contrary, is accessible and mutable by userland code. A monitoring system like Zipkin or OpenTelemetry can provide accessors to it as well can frameworks like .NET.

These two headers are in fact orthogonal. They have a different purpose, different manipulation rules, different trust boundaries that need to be applied and are accessed by different parts of a running application

The tracestate header includes the parent in a potentially vendor-specific format:

I suspect people are forgetting wording like this, here specifically "Includes the parent". To say tracestate is orthogonal to baggage despite the reference implementation not reflecting this is very weird.

tracestate is owned by the participating monitoring systems and can only be manipulated by them.

Where is this documented?

What "reference implementation" of tracestate depends on baggage?

The tracestate header includes the parent in a potentially vendor-specific format:

I suspect people are forgetting wording like this, here specifically "Includes the parent". To say tracestate is orthogonal to baggage despite the reference implementation not reflecting this is very weird.

tracestate is owned by the participating monitoring systems and can only be manipulated by them.

Where is this documented?

bagagge is a spec currently in early draft status. Use cases will be documented. There is no reference implementation yet.

It's literally a different spec just driven by the same working group and fully orthogonal.

edit:
It might be good to figure out on which page you are and where this assumption is coming from so that we can be clear about it and fix things if there's something ambiguous.

While could be misinterpreting, I think of OpenTelemetry as a reference implementation of tracecontext and baggage. So looking at it, both TraceState and Baggage look very similar, just maps that are automatically propagated

https://github.com/open-telemetry/opentelemetry-java/blob/master/api/all/src/main/java/io/opentelemetry/api/trace/TraceState.java#L62

https://github.com/open-telemetry/opentelemetry-java/blob/master/api/all/src/main/java/io/opentelemetry/api/baggage/Baggage.java#L94

I'm wondering if these implementations are valid - TraceState definitely does not reflect the semantics listed here

https://w3c.github.io/trace-context/#relationship-between-the-headers

And because they are both maps that are propagated to remotes, I wonder why not just use baggage to propagate state about a trace? That's where the confusion is coming from. Perhaps tracestate-as-a-map is not a useful enough API for tracestate?

There should be no API for tracestate, it can only be accessed by the SDK, never by the end user.

One considerations for any MUST on a spec is a difference of platforms that enable propagation like ASP.NET and tracing systems. Platforms would love to avoid parsing tracestate at all unless there is a tracing system involved. trace-id and span-id are parsed and used by platform, while tracestate is just a set of opaque values that is simply passed unmodified.

Another example of a system that was discussed for tracestate is an edge systems that want to add a tracestate value by simply adding extra header tracestate that will be appended to the any tracestate header that was already present on the request. Later the tracing system that will do parsing can squash duplicate.

If we limit discussion to tracing systems, I can totally see the value of enforcing uniquiness. But since this spec covers platforms as well - enforcing anything that requires parsing is hard.

There should be no API for tracestate, it can only be accessed by the SDK, never by the end user.

there might be need for it is sampling priority will be propagated using the tracestate. So I believe it may be limited now, but exposed later.

There should be no API for tracestate, it can only be accessed by the SDK, never by the end user.

there might be need for it is sampling priority will be propagated using the tracestate. So I believe it may be limited now, but exposed later.

That will not require an API exposing the tracestate to the user. Maybe internal SPI in the SDK.

There should be no API for tracestate, it can only be accessed by the SDK, never by the end user.

there might be need for it is sampling priority will be propagated using the tracestate. So I believe it may be limited now, but exposed later.

That will not require an API exposing the tracestate to the user. Maybe internal SPI in the SDK.

Propagators need to be able to parse and manage tracestate entries, so that requires the tracestate to have a API surface in the API itself. I think this is a bit outside the scope of this particular issue, though. :)

While could be misinterpreting, I think of OpenTelemetry as a reference implementation of tracecontext and baggage. So looking at it, both TraceState and Baggage look very similar, just maps that are automatically propagated

To my knowledge, OpenTelemetry is compliant with the spec but it's not a reference implementation in the sense that it implements it fully as not all use-cases are applicable.

Speaking about use-cases:
tracestate exists because some monitoring systems need additional information passed around with a trace.
This is especially important for migration/legacy scenarios.
As an example, some vendors manipulate traceparent but - under the hood - only rely on their tracestate. This tracestate actually contains their old format as their backends can not yet deal with traceid and parentid.

OpenTelemetry does not have this problem. Everything OpenTelemetry needs at this point is traceparent.
As such, it's completely legit, that they just propagate it unchanged.

I can even think, that OpenTelemetry might never need it, but this does not change the fact, that many other systems do to cover their usecase.

Discussions here about what should or should not be available in an API, SDK, SPI, etc are irrelevant. We are only defining the underlying protocol for a tracing system to use. How they expose the information provided by the protocol is their business.

I agree with @adriancole 's initial proposal:

  • sender MUST ensure keys are unique
  • if a received header has non-unique keys, it is malformed
  • receiver SHOULD try to deduplicate non-unique keys via last-wins

I agree with @adriancole 's initial proposal:

  • sender MUST ensure keys are unique
  • if a received header has non-unique keys, it is malformed
  • receiver SHOULD try to deduplicate non-unique keys via last-wins

I also agree this is probably the best course of action. It is also the behavior that is, if not required by the spec, at least suggested by the processing model already. Making it explicitly required sounds like it can at least clear up some confusion.

Do we think that this change would be breaking or is it somewhat implied and now just more precise?
Also, maybe let's phrase it to the last added with would be the leftmost.

per https://github.com/w3c/trace-context/blob/master/spec/20-http_request_header_format.md

Only one entry per key is allowed because the entry represents that last position in the trace.

(although the part after "because" does not quite make sense )

I added #477 to capture one part of this discussion (sender must not add duplicates), as this seems to be consensus and it is already in the spec as pointed out by Yuri.

I think we still need to come to a conclusion with respect to what to do when you already receiving duplicates from upstream. Two alternatives have been mentioned:

  1. Vendors should only touch the keys they "own" and treat other key-value pairs completely opaque. (See here)
  2. When receiving the tracestate header, vendors can (or even should?) treat received tracestate values with non-unique keys, as malformed-but-fixable and can/should deduplicate non-unique keys via last-wins.

Both sentiments have been expressed in this issue with some consensus on option (2.) in the last few posts (Adrian, Yuri, Daniel D.)

My 2 cts: I actually lean more towards (1.). Reason: Vendors should not be obliged to spend any processing overhead on cleaning up issues that another vendor has caused (read: another vendor has screwed up). So from that perspective, (2.) should at the very least not be phrased as a SHOULD. And IMHO it is not worth having it as a MAY, except maybe for reducing the length of an overlong tracestate header.

@basti1302 I don't have objection to your option (1), except that making "treat other key-value pairs completely opaque" into a MUST could actually be hard to support when the implementation uses something like a linked hash map. I think it would need to be combined with "MAY deduplicate".

Yes, that's fair. How about something like this maybe:

Vendors MAY discard duplicate keys that weren't generated by themselves, but are not obliged to do that. When choosing to deduplicate a key that was not generated by themselves, the left-most key-value pair with that key must be kept.

I would prefer to avoid any prescriptive language about how deduping is done. The header is already malformed by the previous node, the current node can deal with dups any way they wish.

My thinking was that when you mess with key-value pairs you do not own, you should at least keep the one that has been added most recently (so the left-most) and discard the others. But you have a point, this can only happen if the other vendor is in violation of the spec, so whatever.

New suggestion:

Vendors MAY discard duplicate keys that were not generated by themselves, but are not obliged to do that.

"not generated by them" (else you could interpret it as keys generating themselves)

I'll let this sit for a while to see if other folks have a different take on this and then I can either add it to #477 or open a new PR. Thanks, Yuri!

I originally expressed support for (2) because it at least made it predictable what would happen if a key ended up duplicated somehow. I'm also completely fine with (1) as it is a smaller burden on the implementer. I also agree in general if a system sends a malformed header, they should expect something undefined might happen.

I have also added this aspect to the PR now: ccbabd6