w3c/server-timing

expand description field to allow json instead of just quoted-string and token

Opened this issue · 6 comments

A common use for the description field is to add additional details that can be used by other visualizations or systems. However, this is limiting because it allows neither JSON values nor other name/value pair notations. I propose that instead of using RFC 7230's quoted-string value we expect the value to be a JSON value.

For example, suppose the server wanted to include attributes such as request-id, o-bytes and product-id in the desc field, expressed as the following JSON:

{
  "request-id": "8861050c",
  "o-bytes": 580970,
  "product-id": "e39141b",
  "cache": "hit"
}

To include this in the desc entry all the quotes must be escaped:

Server-timing: origin; dur=50; desc="{\"request-id\":\"8861050c\", \"o-bytes\": 580970, \"product-id\": \"e39141b\",\"cache\": \"hit\"}";
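The escaping step above can be sketched in Python to make the overhead concrete. This is only an illustration: `to_quoted_string` and `from_quoted_string` are hypothetical helper names, and the sketch assumes the only characters that need a backslash inside a quoted-string are the double quote and the backslash itself.

```python
import json

def to_quoted_string(text: str) -> str:
    """Wrap text as an RFC 7230 quoted-string, escaping backslash and DQUOTE."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def from_quoted_string(quoted: str) -> str:
    """Reverse to_quoted_string: strip the outer quotes, drop escaping backslashes."""
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            i += 1  # skip the backslash, keep the character it escapes
        out.append(body[i])
        i += 1
    return "".join(out)

attrs = {
    "request-id": "8861050c",
    "o-bytes": 580970,
    "product-id": "e39141b",
    "cache": "hit",
}

# Compact JSON first, then quoted-string escaping on top of it.
compact = json.dumps(attrs, separators=(",", ":"))
desc = to_quoted_string(compact)
header_value = f"origin; dur=50; desc={desc}"

# Extra bytes added purely by the quoted-string layer.
overhead = len(desc) - len(compact)

# A consumer has to undo both layers to get the attributes back.
recovered = json.loads(from_quoted_string(desc))
```

For this payload the quoted-string layer adds one backslash per JSON double quote plus the two outer quotes, and the consumer has to un-escape before it can parse.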

However, even this is still invalid because of the use of {, }, and :. Other common field delimiter characters such as =, ',' and ';' are likewise prohibited. This means that even if you wanted to create a pseudo name-value pair set, you would have to use a limited vocabulary.

[Edit: this is incorrect, these are valid characters. Still, the double escaping causes bloat in the description field and makes the description field illegible without first un-escaping the desc field and then parsing it as JSON.]

A viable solution is:

Server-timing: origin; dur=50; desc="requestid_8861050c+obytes_580970+productid_e39141b+cache_hit";

This is problematic because I have invented my own token syntax that is non-standard and requires any consumer to be likewise aware of my use of _ and +.
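To illustrate the coupling, here is a minimal sketch of what a consumer of that ad hoc syntax would have to implement. `parse_custom_desc` is a hypothetical name, and the rule that + separates pairs and the first _ separates name from value is exactly the private contract being described, not anything defined by the spec.

```python
def parse_custom_desc(desc: str) -> dict:
    """Parse the ad hoc 'name_value+name_value' syntax from the example above."""
    pairs = {}
    for item in desc.split("+"):
        # Split on the first underscore only; a name containing '_' would break this.
        name, _, value = item.partition("_")
        pairs[name] = value
    return pairs

parsed = parse_custom_desc(
    "requestid_8861050c+obytes_580970+productid_e39141b+cache_hit"
)
```

Note that every value comes back as a string (obytes is "580970", not a number), and the hyphens in the original attribute names had to be dropped to fit the scheme.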

To address these issues, I would propose that the desc= field should accept JSON as a valid value. I would further propose that the spec require desc= to occur only once on a given Server-Timing entry, and that it be required to be the last value so as not to confuse the parser with other tokens.

Server-timing: origin; dur=50; desc={"request-id": "8861050c","o-bytes": 580970,"product-id": "e39141b","cache": "hit"}

(Alternatively, desc= could be expanded to allow token, quoted-string or JSON, but I don't know what the implications are for the parser implementation.)
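One way to see the parser concern is to run a quote-aware list splitter over a header that carries bare JSON. The sketch below is an assumption-laden illustration, not how any particular parser works: `split_outside_quotes` is a hypothetical helper, and the second metric `db; dur=12` is invented here to show what happens when more than one entry shares the header.

```python
def split_outside_quotes(value: str, sep: str = ",") -> list:
    """Split on sep, ignoring separators that appear inside double-quoted strings."""
    parts, current = [], []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            current.append(ch)
            escaped = False
        elif in_quotes and ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts

# Bare JSON in desc=, followed by a second (hypothetical) metric.
header = (
    'origin; dur=50; desc={"request-id": "8861050c","o-bytes": 580970,'
    '"product-id": "e39141b","cache": "hit"}, db; dur=12'
)

pieces = split_outside_quotes(header)
```

The commas between JSON members sit outside the JSON's own quoted strings, so even a splitter that respects quotes cuts the object into fragments. A parser would need to track braces as well, which is presumably part of why the proposal asks for desc= to come last.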

{ (x7b), } (x7d), and : (x3a) are all legal in quoted-strings, see definition of qdtext:

     qdtext         = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
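The rule can be checked mechanically. The sketch below transcribes the qdtext ranges into a small Python predicate; `is_qdtext` is a hypothetical name, and obs-text is taken to be %x80-FF as defined in RFC 7230.

```python
def is_qdtext(ch: str) -> bool:
    """Return True if ch may appear unescaped inside an RFC 7230 quoted-string."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x20, 0x21)   # HTAB, SP, "!"
        or 0x23 <= cp <= 0x5B      # "#" through "["
        or 0x5D <= cp <= 0x7E      # "]" through "~"
        or 0x80 <= cp <= 0xFF      # obs-text
    )

# Characters discussed in this thread, mapped to whether they are legal unescaped.
legal = {ch: is_qdtext(ch) for ch in '{}:=,;"\\'}
```

Only the double quote (%x22) and the backslash (%x5C) fall outside the ranges, which is why they are the two characters that need a quoted-pair escape.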

I will add test cases for this everywhere I can.

You are right, within qdtext these are valid characters. I've updated the issue.

However, the desc requires all these escaped double quotes, which makes the value illegible outside of its context and requires post-processing because of the superfluous characters. If we are concerned about the 5 characters saved by shortening duration to dur, then we should be likewise concerned about all the extra \ characters.

For readability and to save some bytes on the wire, serialize your JSON with ' (single quote) instead of ".

Sure. But now it's not valid JSON (per RFC 7159, " is used to delimit names and string values). Again, consumers will not only have to know to s/'/"/g but now also s/\\'/'/g. The intention here is that the consumer (human or machine) should not have to know any prior parsing contract and the contents should be natural to digest.
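A short Python check of that objection, using the standard `json` module. The sample strings and the `try_parse` helper are made up for illustration.

```python
import json

def try_parse(text: str):
    """Return the parsed JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

single_quoted = "{'request-id': '8861050c', 'cache': 'hit'}"

# A standard JSON parser rejects single-quoted names and values outright.
rejected = try_parse(single_quoted)

# Swapping quotes works only while no value contains an apostrophe.
swapped = try_parse(single_quoted.replace("'", '"'))

# With an escaped apostrophe, the naive swap parses but changes the value.
tricky = "{'note': 'it\\'s a hit'}"
mangled = try_parse(tricky.replace("'", '"'))
```

In the last case the consumer silently gets `it"s a hit` instead of `it's a hit`, which is the second substitution the comment above says consumers would also have to know about.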

Paging @mnot for guidance. Mark, what's the status of the structured headers proposal, and what's the best way to tackle what we have here at this point?

mnot commented

It isn't currently a goal for structured data to carry arbitrarily nested structured data, nor to carry JSON.

However, you could carry JSON inside a SH string, and it would be correctly escaped and un-escaped by SH tooling on each end. Yes, you'd have a bit of overhead with the escapes.

Why can't those attributes be put in other params? It doesn't seem like great practice to overload desc like this -- my understanding was that it was intended for presentation to humans (e.g., in devtools).
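As a rough sketch of that alternative, the attributes from the example could be emitted as separate parameters instead of being packed into desc. Everything here is an assumption for illustration: `server_timing_entry` and `param_value` are hypothetical helpers, the parameter names are simply the attribute names from the issue, and whether consumers would surface such extra parameters is not something this thread establishes.

```python
import re

# RFC 7230 token: one or more tchar.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

def param_value(value) -> str:
    """Emit value as a bare token when possible, else as a quoted-string."""
    text = str(value)
    if TOKEN.fullmatch(text):
        return text
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

def server_timing_entry(name: str, dur, params: dict) -> str:
    """Build one Server-Timing entry with dur plus extra name=value params."""
    parts = [name, f"dur={dur}"]
    parts += [f"{key}={param_value(val)}" for key, val in params.items()]
    return "; ".join(parts)

entry = server_timing_entry(
    "origin",
    50,
    {
        "request-id": "8861050c",
        "o-bytes": 580970,
        "product-id": "e39141b",
        "cache": "hit",
    },
)
```

With these values every parameter fits in a bare token, so no quoting or escaping is needed at all, and desc stays free for human-readable text.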