w3ctag/design-reviews

Compression Dictionary Transport: Align on a single Content-Encoding for the web?

pmeenan opened this issue · 3 comments

This is not so much a dispute as a decision that we did not think was appropriate to make at the HTTP level in the IETF, but that may be worth discussing for the web use case.

I'm requesting the TAG express an opinion on a dispute related to:

Explanation of the issue that we'd like the TAG's opinion on:

"Compression dictionary transport" lets you use a previous version of a resource, or a separate dedicated resource, as a compression dictionary for future requests at the HTTP layer (using content-encoding). It effectively allows for delta-compressing updates, or for extracting the common parts of a page into a dictionary that transparently becomes a "template". This usually reduces the bytes transferred by 90% or more.
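To make the delta-compression idea concrete, here is a minimal sketch. It is only an illustration: it uses the preset-dictionary feature of DEFLATE from Python's standard `zlib` module as a stand-in, not the Brotli or Zstandard encodings from the draft, and the resource contents and helper names (`old_version`, `new_version`, `compress`, `decompress`) are made up for the example.

```python
import hashlib
import zlib

# Hypothetical "previous version" of a resource (about 13 KB of text that
# does not compress especially well on its own).
old_version = b"".join(
    hashlib.sha256(str(i).encode()).hexdigest().encode() + b"\n"
    for i in range(200)
)

# Hypothetical "new version": the same resource with one small edit.
new_version = old_version.replace(old_version[640:650], b"PATCHED-LINE")


def compress(data, dictionary=None):
    """Compress data, optionally priming the compressor with a dictionary."""
    if dictionary is None:
        c = zlib.compressobj(9)
    else:
        c = zlib.compressobj(
            9, zlib.DEFLATED, zlib.MAX_WBITS,
            zlib.DEF_MEM_LEVEL, zlib.Z_DEFAULT_STRATEGY, dictionary,
        )
    return c.compress(data) + c.flush()


def decompress(payload, dictionary):
    """Decompress a payload that was produced with the same dictionary."""
    d = zlib.decompressobj(zlib.MAX_WBITS, dictionary)
    return d.decompress(payload) + d.flush()


plain = compress(new_version)                # no dictionary
delta = compress(new_version, old_version)   # old version as dictionary

print(len(new_version), len(plain), len(delta))
```

The client can only decode `delta` if it still holds the exact same `old_version` bytes, which is why the real mechanism has to negotiate which dictionary is in use. DEFLATE only looks back 32 KB, so this stand-in would not work for large resources; the window sizes of the encodings discussed below are what make the approach practical for real web content.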

As part of the IETF draft, we have defined content-encodings that are roughly equivalent to the existing br and zstd with specific window sizes that work well for web content up to ~50-100MB. The spec allows for future encodings that can be defined for new use cases as they come up while still leveraging the same encoding negotiation (huge files, content-aware diff algorithms, etc).

There is an open question as to whether the web would benefit more if we specified only one encoding, allowing clients to get the full benefit without having to include both Brotli and Zstandard decode engines. We agreed that it wasn't appropriate to limit the options as far as HTTP itself was concerned, but that it would be worth considering for the web use case specifically.

There are slight differences in the capabilities of the currently-spec'd encodings:

  • dcb (Brotli) produces smaller output at the maximum settings than dcz (Zstandard)
  • dcb is limited to ~50 MB resources for delta-updates, while dcz can go up to ~100 MB (a limit we exercised in the Chrome origin trial with some large WASM applications)

Links to the positions of each side in the dispute (e.g., specific github comments):

Pick one encoding:

Allow for multiple:

What steps have already been taken to come to an agreement:

We have discussed this at length in the issue and on the mailing list, and largely decided that it wasn't appropriate to limit the options at the HTTP level, but that it might make sense to discuss whether a limit would be appropriate for the web use case specifically.

There doesn't appear to be a conflict here that requires the TAG to weigh in. Closing this.

Is this the wrong way to ask the TAG to weigh in, or is it a matter of the TAG not having an interest in specifying content encodings for the web use case?

The specific question for the TAG is whether the content-encodings available to dictionary compression should be curated (i.e. just Brotli), or whether clients and origins should be able to use whatever they want (and agree to in negotiation).

At the HTTP level, when it makes it to RFC, there will be no restrictions. However, an argument has been made that site owners and web clients would benefit if, at least initially, the available encodings were artificially limited (kind of like Brotli was by virtue of browsers). Otherwise, all clients will need to implement all of the encodings that Chrome implements, as sites will deploy a variety of encodings (or sites will need to implement multiple encodings if browsers don't ship an overlapping set).
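As a rough illustration of the overlap problem, here is a simplified sketch of how an origin might pick an encoding from what a client advertises. It assumes the client lists the dictionary encodings as `dcb` and `dcz` tokens in `Accept-Encoding`; the function names are hypothetical, and real negotiation has more detail (q-value ranking, matching the dictionary itself) that is left out here.

```python
DICTIONARY_ENCODINGS = {"dcb", "dcz"}


def parse_accept_encoding(header):
    """Return the set of acceptable encoding tokens.

    Simplification: q-values are only checked for q=0 (explicitly refused);
    no ranking by weight is done.
    """
    accepted = set()
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        token = parts[0].lower()
        if not token:
            continue
        refused = False
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    refused = float(param[2:]) == 0.0
                except ValueError:
                    refused = True
        if not refused:
            accepted.add(token)
    return accepted


def choose_encoding(accept_encoding, server_preference, has_dictionary):
    """Pick the first server-preferred encoding the client accepts.

    Dictionary encodings are skipped when the client holds no matching
    dictionary. Falls back to "identity" (no encoding) in this sketch.
    """
    accepted = parse_accept_encoding(accept_encoding)
    for encoding in server_preference:
        if encoding in DICTIONARY_ENCODINGS and not has_dictionary:
            continue
        if encoding in accepted:
            return encoding
    return "identity"


# A client that ships both decoders works with any origin.
print(choose_encoding("gzip, br, zstd, dcb, dcz", ["dcb", "br", "gzip"], True))

# A client that only ships dcz, talking to an origin that only produces dcb,
# silently loses the dictionary benefit and falls back to plain br.
print(choose_encoding("gzip, br, dcz", ["dcb", "br", "gzip"], True))
```

The second call is the case described above: nothing breaks, but unless browsers and sites converge on an overlapping set, one side has to implement more than one encoding to keep the savings.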

Hi @pmeenan, conflict escalation is for when a WG can't come to consensus and wants the TAG to help resolve the conflict. As far as we're aware, there is consensus in the WG on how to proceed here, or at least there's not an issue that requires more than the normal WG process to resolve.

The TAG would be happy to weigh in with an opinion, but that would fall under a normal design review, not a conflict escalation.