jdougan/zeros-silo-2021

More than content-type and x-secondlife-* headers should be stored and retrieved

jdougan opened this issue · 5 comments

Add to the regex to store x-silo-* headers? What is the modern mechanism for header extensions?
Maybe put multiformat/multihash headers here?

What to call any new headers? The traditional X- prefix has been deprecated (RFC 6648) and no solid recommendation has emerged.

As x- has been deprecated, maybe use a reverse domain name?
net.opencobalt.silo.*
net-opencobalt-silo-*
net_opencobalt_silo-*

And the dot is reserved for future header structure standards, so dash or underscore it is.
net-opencobalt-silo-multihash: 567578585687687667867665476865678676045653597595
net_opencobalt_silo_multihash: 567578585687687667867665476865678676045653597595

DNS labels can have dashes but not underscores. Substitute underscores for the DNS period separator, and use them as the structural separator as well.
net_open-cobalt_silo_multihash: 567578585687687667867665476865678676045653597595

Maybe add a date like the fdc URNs (https://datatracker.ietf.org/doc/html/rfc4198) and tag URIs (https://datatracker.ietf.org/doc/html/rfc4151) do?
net-opencobalt-silo-2021-multihash: 45357834569873465783465763478956873426564387
net_open-cobalt_silo_2021_multihash: 45357834569873465783465763478956873426564387
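
A minimal sketch of how such a name could be assembled, assuming the underscore-separated, reversed-domain, dated form above. The function name `silo_header_name` and the domain `silo.open-cobalt.net` are hypothetical and used only for illustration.

```python
def silo_header_name(domain, year, field):
    """Build a name such as net_open-cobalt_silo_2021_multihash."""
    labels = domain.lower().split(".")
    for label in labels:
        # DNS labels may contain dashes but not underscores, so the
        # underscore stays unambiguous as the separator.
        if not label or "_" in label:
            raise ValueError(f"invalid DNS label: {label!r}")
    # Reverse the domain, then append the date and the field name.
    return "_".join(list(reversed(labels)) + [year, field])


print(silo_header_name("silo.open-cobalt.net", "2021", "multihash"))
# → net_open-cobalt_silo_2021_multihash
```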

We can't use them now, but a couple examples of a structured header label prefix format:
fdc.net_opencobalt_silo.2021.multihash: 45357834569873465783465763478956873426564387
fdc.net_open-cobalt_silo.2021.multihash: 45357834569873465783465763478956873426564387

fdc URNs use the non-reversed domain name.
fdc.silo_opencobalt_net.2021.*
fdc.silo_open-cobalt_net.2021.multihash: 45357834569873465783465763478956873426564387

Another option is to define a single field with a list of values:
net-opencobalt-silo: "key=value", "k2=v2"
which would be easier to register, but otherwise more painful to parse.
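
To give a feel for the parsing cost, here is a rough sketch of reading the single-field form, assuming the value is a comma-separated list of double-quoted `key=value` items with backslash escapes. The name `parse_silo_field` is hypothetical and the exact grammar is an assumption, not something decided in this issue.

```python
import re

# One double-quoted item; backslash escapes are allowed inside the quotes.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_silo_field(value):
    """Return the (key, value) pairs found in the field value, in order."""
    pairs = []
    for match in _QUOTED.finditer(value):
        item = re.sub(r"\\(.)", r"\1", match.group(1))  # undo backslash escapes
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"item without '=': {item!r}")
        pairs.append((key.strip(), val))
    return pairs


print(parse_silo_field('"key=value", "k2=v2"'))
# → [('key', 'value'), ('k2', 'v2')]
```

Returning a list of pairs rather than a dict keeps repeated keys and their order.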

Is it possible/reasonable to simply register Store and Retrieve fields as a generic "store this thing" mechanism? Sort of a reverse cookie.

Maybe see if any existing field will work. Cookie:/Set-Cookie: isn't it; that is for the reverse direction.
Link: would be potentially useful. Base:/Content-Base: in conjunction with Link: would be useful. Many of the Content-* fields should be considered. URI: is a contender.

https://www.rfc-editor.org/rfc/rfc3061.html

Should document current limitations of the header field storage.

It appears that multiple headers of the same key may be supported. All it is doing is copying the headers that match a regex off to a .meta file then copying the lines that match a different regex back into the GET/HEAD response.
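
The copy-out and copy-back described above can be sketched roughly as follows. The two regexes are assumptions (the silo's real patterns are not quoted in this issue), and `headers_to_meta` and `meta_to_response` are hypothetical names. The point is only that line-by-line copying keeps repeated keys.

```python
import re

# Assumed patterns, for illustration only.
STORE_RE = re.compile(r"^(content-type|x-secondlife-[\w-]+):", re.IGNORECASE)
RETRIEVE_RE = re.compile(r"^content-type:", re.IGNORECASE)


def headers_to_meta(request_header_lines):
    """Text of the .meta file: every request header line matching STORE_RE."""
    kept = [line.rstrip("\r\n") for line in request_header_lines
            if STORE_RE.match(line)]
    return "".join(line + "\n" for line in kept)


def meta_to_response(meta_text):
    """Header lines copied back into a GET/HEAD response."""
    return [line for line in meta_text.splitlines() if RETRIEVE_RE.match(line)]


meta = headers_to_meta([
    "Content-Type: text/plain\r\n",
    "X-SecondLife-Shard: Production\r\n",
    "X-SecondLife-Shard: Testing\r\n",   # same key twice: both lines are kept
    "Accept: */*\r\n",                   # no match: dropped
])
print(meta_to_response(meta))
# → ['Content-Type: text/plain']
```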

The actual problem is that the Python test client uses the standard Python HTTP API, which takes a dict for extra headers, so you can't have more than one header with the same key. Need to either go third party or use a lower-level API to test the headers properly.
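
One lower-level route that stays in the standard library is `http.client`'s `putrequest`/`putheader`/`endheaders` calls, which accept the same header name more than once. A minimal sketch; the helper name `send_with_repeated_headers` is hypothetical and this is not the existing test client.

```python
import http.client


def send_with_repeated_headers(conn, method, path, headers, body=b""):
    """Send a request whose headers are a list of (name, value) pairs.

    Because headers is a list rather than a dict, a name may repeat.
    """
    conn.putrequest(method, path)
    for name, value in headers:
        conn.putheader(name, value)  # one header line per call
    # putrequest does not add Content-Length, so set it explicitly.
    conn.putheader("Content-Length", str(len(body)))
    conn.endheaders(message_body=body)
```

After the call, `conn.getresponse()` reads the reply as usual. The `conn` argument is an ordinary `http.client.HTTPConnection` pointed at wherever the silo under test is listening.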

Potential standard headers to store

  • URI
  • Link
  • Alternates
  • Derived-From
  • Base - Deprecated
  • Content-Base
  • Content-Disposition
  • Content-Encoding
  • Content-ID
  • Content-Language
  • Content-Length - Implicitly stored
  • Content-Location
  • Content-MD5 - Checksum only, as MD5 has become weak.
  • Content-Range - No, not useful
  • Content-Script-Type
  • Content-Style-Type
  • Content-Type - Explicitly stored already
  • Content-Version
  • Digest
  • From
  • Last-Modified - Generated by the silo, time of last put
  • Transfer-Encoding

Final contenders for new headers

Structured:
org_voidrandom_silo_2021_*
net_opencobalt_silo_2021_*

Randomized:
01b9d740-e7a1-42e6-8ec1-efe621f5bb45_*
01b9d740e7a142e68ec1efe621f5bb45_*
meal-sector-power-police_*

Proposed solution is to keep the existing behavior of X-SecondLife-* headers (store, no retrieve) and add storing and retrieving of 01b9d740e7a142e68ec1efe621f5bb45_* keys.
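
A rough sketch of what that policy could look like as a pair of patterns over header names. The regexes and the `header_policy` helper are hypothetical, and treating Content-Type as both stored and retrieved is an assumption based on the issue title.

```python
import re

SILO_KEY = "01b9d740e7a142e68ec1efe621f5bb45"

# Stored: Content-Type, X-SecondLife-*, and the new prefixed keys.
STORE_RE = re.compile(
    rf"^(content-type|x-secondlife-.+|{SILO_KEY}_.+)$", re.IGNORECASE)
# Retrieved: as above, minus X-SecondLife-* (store, no retrieve).
RETRIEVE_RE = re.compile(
    rf"^(content-type|{SILO_KEY}_.+)$", re.IGNORECASE)


def header_policy(name):
    """Return (stored, retrieved) for a header name."""
    return bool(STORE_RE.match(name)), bool(RETRIEVE_RE.match(name))


print(header_policy("X-SecondLife-Shard"))      # → (True, False)
print(header_policy(SILO_KEY + "_multihash"))   # → (True, True)
print(header_policy("Accept"))                  # → (False, False)
```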