Versioning for additional schema
We presently have 3 documents: the main schema, the QUIC schema, and the HTTP/3 and QPACK schema.
We presently have 1 version field, qlog_version, to express changes in any of these documents.
We make efforts to keep these things in lock step, but the relationship is quite implicit and not explicitly stated. That's not brilliant and adds some friction.
One proposal I have is to articulate the schema(s) used in the log. I don't think we need to go as far as XML does (e.g. https://www.oreilly.com/library/view/xml-in-a/0596007647/re168.html) but we could just include the I-D or RFC name as a string. Then each additional schema document can express what version of qlog main schema it depends on. I'll write this up as a PR.
As discussed on the call yesterday, everyone generally likes this concept on how to tackle versioning. Additionally, it can help replace protocol_type together with #286.
One thing I personally did not really love in current PR #284 is that we use a separate approach for the main schema version (indicated via qlog_version) and the protocol event definition versions (indicated via the new additional_schema field). Another option would be to treat the main schema as just another one in the list.
Put differently, this is what the PR currently proposes:
"qlog_version": "0.3",
"qlog_format": "JSON",
"additional_schema": [
"draft-ietf-quic-qlog-quic-events-03",
"draft-ietf-quic-qlog-quic-h3-events-03"
],
While a more consistent option would be:
"qlog_format": "JSON",
"qlog_schemas": [
"draft-ietf-quic-qlog-main-schema-04",
"draft-ietf-quic-qlog-quic-events-03",
"draft-ietf-quic-qlog-quic-h3-events-03"
],
Downsides I could find:
- We're stuck with the qlog_schemas field and its definition forever. No way to, e.g., refactor it to a list of protocols and their versions (e.g., instead of "draft-ietf-quic-qlog-quic-events-03" it could say ["HTTP/3", "v5.1"]). This is true for the qlog_version field as well, but that seems like much less of an issue.
- For parsers that can handle multiple different versions, it's easier to do a quick check on qlog_version, rather than having to parse the full qlog_schemas field to find the main-schema entry and decide if they support it or not.
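To make the second downside concrete, here is a rough Python sketch of what a parser would do under each option. It is only an illustration: the two headers are the examples from this comment, while the supported-version sets and the function names are hypothetical, and matching on the substring "-main-schema-" is an assumption about how a parser might spot the main schema entry.

```python
import json

# Header shape proposed in the PR (separate qlog_version field).
HEADER_PR = json.loads("""{
  "qlog_version": "0.3",
  "qlog_format": "JSON",
  "additional_schema": [
    "draft-ietf-quic-qlog-quic-events-03",
    "draft-ietf-quic-qlog-quic-h3-events-03"
  ]
}""")

# The "more consistent" alternative (main schema is just another list entry).
HEADER_LIST = json.loads("""{
  "qlog_format": "JSON",
  "qlog_schemas": [
    "draft-ietf-quic-qlog-main-schema-04",
    "draft-ietf-quic-qlog-quic-events-03",
    "draft-ietf-quic-qlog-quic-h3-events-03"
  ]
}""")

# Hypothetical: what this particular parser claims to understand.
SUPPORTED_VERSIONS = {"0.3"}
SUPPORTED_MAIN_SCHEMAS = {"draft-ietf-quic-qlog-main-schema-04"}


def supports_by_version(header):
    """Quick check: a single field lookup."""
    return header.get("qlog_version") in SUPPORTED_VERSIONS


def find_main_schema(header):
    """Scan the whole list for the main schema entry (assumed naming)."""
    for name in header.get("qlog_schemas", []):
        if "-main-schema-" in name:
            return name
    return None


def supports_by_schema_list(header):
    """Slower path: find the main schema entry first, then decide."""
    return find_main_schema(header) in SUPPORTED_MAIN_SCHEMAS
```

The list-based check is not hard, but it does force every parser to agree on how the main schema entry is recognised, which the single qlog_version field avoids.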
I think those downsides are sufficient to go with the original proposal, but in that case, I'd rename additional_schema to protocol_schemas or event_schemas (or the qlog_version field to qlog_schema_version ;))
So the issue with a list of name and value tuples is that you'd need to go to the effort of defining a naming strategy and a registry to hold those names to avoid conflicts. That's overhead I'd like to avoid.
qlog_schema_version sounds fine to me.
I'd stick with something like additional_schemas though. There's no restriction I can think of that means new schemas are constrained in what they can extend.
Thinking about this more and comparing with other similar projects, I think keeping the original proposal is fine, but that we should probably use the term "namespace" instead of "schema" (seems to be used quite consistently across XML, YAML, even CDDL 2.0).
Thus:
"qlog_version": "0.3", # version of the "default" namespace, which is the "main-schema" document
"qlog_format": "JSON",
"namespaces": [
"draft-ietf-quic-qlog-quic-events-03",
"draft-ietf-quic-qlog-quic-h3-events-03"
],
qlog doesn't define the term namespace anywhere, so it would have to do that. Realistically, I think that pushes us towards making this all much more explicit. I.e. providing guidelines defining namespaces and registering them in a new IANA registry. That's not far off what we talked about earlier with the tuple suggestion but I don't see a good reason to have a separate version field.
Discussed on the call: "namespaces" has some implicit connotations that imply things like versioning and uniqueness, which we don't want because they need IANA support. Keep additional_schema and make clear the entries refer to datatracker documents only. Basically: merge the PR after double-checking it's good :)
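As an illustration of "entries refer to datatracker documents only", a tool could sanity-check the list along these lines. This is a sketch under assumptions, not something the PR defines: the regular expressions encode the usual I-D naming pattern (draft-...-NN with a two-digit revision) and an assumed "rfcNNNN" spelling for RFCs, and the function names are hypothetical.

```python
import re

# Assumed shape of a versioned Internet-Draft name, e.g.
# "draft-ietf-quic-qlog-quic-events-03".
DRAFT_RE = re.compile(r"^draft-[a-z0-9]+(?:-[a-z0-9]+)*-\d{2}$")

# Assumed spelling for RFC entries; the thread does not fix one.
RFC_RE = re.compile(r"^rfc\d+$", re.IGNORECASE)


def split_draft(name):
    """Return (base_name, revision) for a versioned I-D name, else None."""
    if not DRAFT_RE.match(name):
        return None
    base, _, revision = name.rpartition("-")
    return base, int(revision)


def is_document_name(name):
    """True if the entry looks like an I-D or RFC name."""
    return bool(DRAFT_RE.match(name) or RFC_RE.match(name))


def check_additional_schema(entries):
    """Return the entries that do not look like datatracker documents."""
    return [entry for entry in entries if not is_document_name(entry)]
```

For example, check_additional_schema applied to the two entries from the PR example returns an empty list, while an arbitrary string such as "my-custom-events" would be reported.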
I don't love naming protocol elements after drafts or RFCs ... it gets very confusing if you ever publish a compatible update of a spec. (E.g.: what RFC defines the content of the media format message/rfc822?). Also there should be a sensible way for people to define proprietary schemas that aren't defined by an IETF document.
It's maybe moderately annoying, but I'd suggest a urn:ietf:params:qlog: namespace for these, with proprietary extensions taking whatever URI their creator wants to put in.
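A rough sketch of how a consumer might tell the two kinds of identifier apart, in Python. Only the urn:ietf:params:qlog: prefix comes from the suggestion above; the classify function, its return labels, and the example identifiers are hypothetical, and "anything with a URI scheme" is an assumed reading of "whatever URI".

```python
from urllib.parse import urlparse

# Prefix taken from the suggestion above; nothing is registered under it here.
IETF_QLOG_PREFIX = "urn:ietf:params:qlog:"


def classify(identifier):
    """Label a schema identifier as 'ietf', 'proprietary', or 'invalid'."""
    if identifier.startswith(IETF_QLOG_PREFIX):
        rest = identifier[len(IETF_QLOG_PREFIX):]
        # A bare prefix with nothing after it does not name a schema.
        return ("ietf", rest) if rest else ("invalid", identifier)
    # Assumption: any other string with a URI scheme is a proprietary schema.
    if urlparse(identifier).scheme:
        return ("proprietary", identifier)
    return ("invalid", identifier)
```

With this, classify("urn:ietf:params:qlog:example") yields ("ietf", "example"), a made-up vendor URI such as "https://example.com/qlog/my-events" is treated as proprietary, and a bare draft name with no scheme is rejected.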
That's a fair point Lennox. Have you got any links to resources for registering and maintaining such a namespace?
The model I'm thinking of is the URIs identifying RTP Header Extensions, RFC 8285 - that's probably a good starting point to base yourself on. (Section 5 and Section 10.1.)
thanks!
It took us a while to get to the final design but thanks @JonathanLennox for the inspiration