Loggable stringified version
In writing CID i wanted a way to expand a CID into a human readable log version, for development. So i wrote this:
This means multihash needs an equivalent version.
@greglook already wrote one over at https://github.com/multiformats/clj-multihash it looks like:
hash:sha2-256:dbd318c1c462aee872f41109a4dfd3048871a03dedd0fe0e757ced57dad6f2d7
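For anyone following along, here is a minimal sketch of how a binary multihash could be expanded into the form above. It assumes the usual binary layout (varint hash code, varint digest length, digest bytes); `HASH_NAMES`, `read_uvarint` and `to_loggable` are hypothetical names made up for this illustration, not part of any multihash library.

```python
import hashlib
from typing import Tuple

# Illustrative subset of the multihash code table; a real implementation
# would cover the full table. The dict name is hypothetical.
HASH_NAMES = {0x11: "sha1", 0x12: "sha2-256", 0x13: "sha2-512"}


def read_uvarint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode an unsigned varint; return (value, next_position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def to_loggable(mh: bytes) -> str:
    """Expand a binary multihash into hash:<name>:<hex digest>."""
    code, pos = read_uvarint(mh)
    length, pos = read_uvarint(mh, pos)
    digest = mh[pos:]
    if len(digest) != length:
        raise ValueError("digest length does not match the length field")
    # Fall back to the numeric code when the name is not in our small table.
    name = HASH_NAMES.get(code, f"0x{code:x}")
    return f"hash:{name}:{digest.hex()}"


# Build a sha2-256 multihash by hand and print its loggable form.
digest = hashlib.sha256(b"hello").digest()
multihash = bytes([0x12, len(digest)]) + digest
print(to_loggable(multihash))
```

The output is meant for logs only; the binary multihash stays the thing you store and compare.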
My comments on it there: multiformats/clj-multihash#7
Relevant comments from: multiformats/clj-multihash#7
I would discourage use of : and prefer - because it would be great if people use the compressed representation in URNs instead of the expanded one... but i can be persuaded. (I would want to ensure people don't rely on the string versions as identifiers, since the whole copy-pastability and versatility of multihash goes down with that.)
See also
I like it generally! Nitpicks:
- We might wanna consider a less general prefix, e.g. mhash or multihash
- I agree about : as a delimiter being less than ideal, but so is -, since it already clashes in the simplest example with sha2-256 :) In the context of URNs, & or ; might fit?
I agree about : as a delimiter being less than ideal, but so is -, since it already clashes in the simplest example with sha2-256 :) In the context of URNs, & or ; might fit?
On the other hand we're talking about a human-readable version, where this is less of an issue.
I agree about : as a delimiter being less than ideal, but so is -, since it already clashes in the simplest example with sha2-256
This was the reason I used : in the URN form, otherwise you can't easily tell whether -256 is part of the algorithm name or a different field in the multihash.
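To make the clash concrete, here is a small sketch that splits the same value on each candidate delimiter. The digest is shortened for readability, and the dash-delimited string is a hypothetical variant that nobody in this thread has actually proposed verbatim.

```python
# Same fields, two delimiters. The digest is truncated for illustration only.
colon_form = "hash:sha2-256:dbd318c1"
dash_form = "hash-sha2-256-dbd318c1"  # hypothetical dash-delimited variant

colon_fields = colon_form.split(":")
dash_fields = dash_form.split("-")

print(colon_fields)  # ['hash', 'sha2-256', 'dbd318c1']
print(dash_fields)   # ['hash', 'sha2', '256', 'dbd318c1']
```

With : the split gives exactly three fields. With - the parser would need a table of known algorithm names to decide where the name ends.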
We might wanna consider a less general prefix, e.g. mhash or multihash
This would also be good - as far as I could find at the time, there was no real accepted standard for the hash URN namespace, so I went with the simplest version I could think of.
i wouldn't be against claiming the hash: prefix -- particularly since our goal is to make it easier to work with many hashes and we would commit to being good stewards of the namespace -- but we may have to put in a real bid for it and be ready to change it if it doesn't fit
Ok, I'm okay with :
The multiformats home page says
They MUST have a human-readable representation.
Was a human-readable representation ever standardized for multihash?
I don't believe it has ever been finalized.
See this independent draft, which could be defined on top of the multihash specification instead of being part of it. Note that the digest length is implicitly given by the length of the string, so we only need to agree on canonical hash function names in addition to the hash function identifiers.
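As a rough sketch of the implicit-length point: going from a textual form back to a binary multihash only needs a name-to-code table, because the length field can be computed from the hex string. This uses the hash:<name>:<hex> form from earlier in the thread rather than the draft's own syntax, and `NAME_TO_CODE`, `write_uvarint` and `from_loggable` are hypothetical names for this example.

```python
# Illustrative subset of canonical names mapped to multihash codes.
NAME_TO_CODE = {"sha1": 0x11, "sha2-256": 0x12, "sha2-512": 0x13}


def write_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def from_loggable(text: str) -> bytes:
    """Parse hash:<name>:<hex digest> into a binary multihash."""
    prefix, name, hex_digest = text.split(":")
    if prefix != "hash":
        raise ValueError("unexpected prefix: " + prefix)
    digest = bytes.fromhex(hex_digest)
    # The length is not written in the text; it is derived from the digest.
    return write_uvarint(NAME_TO_CODE[name]) + write_uvarint(len(digest)) + digest


text = "hash:sha2-256:dbd318c1c462aee872f41109a4dfd3048871a03dedd0fe0e757ced57dad6f2d7"
binary = from_loggable(text)
print(binary[0], binary[1], len(binary))
```

This does not check that the digest length is valid for the named function; a stricter parser probably should.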
There is also the https://github.com/w3c-dvcg/hashlink spec, which leverages multihash to make URIs.
At https://multiformats.io/#what-are-multiformats there are some stipulations:
- They MUST be in-band (with the value); not out-of-band (in context).
- They MUST avoid lock-in and promote extensibility.
- They MUST be compact and have a binary-packed representation.
- They MUST have a human-readable representation.
The last 2 are important. At the moment I am aware of the binary representation of the multihash, however, I didn't see any human-readable representation. Is this issue about adding that human-readable representation?
Also, I see you guys talking about reserving hash: or something, but why not just go with #? THE HASHTAG!!! For example, #sha2-256:dbd318c1c462aee872f41109a4dfd3048871a03dedd0fe0e757ced57dad6f2d7.
@ben221199 The hashtag is used in URI syntax for referencing content within a document, so using it for multihashes in this context makes no sense. A human-readable representation should also be a valid URI, so it can be used as an identifier where URIs are required (e.g. RDF), but the hashtag has nothing to do with that.
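A quick way to see this is to run both forms through a generic URI parser. The sketch below uses Python's standard `urllib.parse.urlsplit`; the digest is shortened for readability.

```python
from urllib.parse import urlsplit

# A leading "#" makes the whole string a fragment of some other document.
hashtag_form = urlsplit("#sha2-256:dbd318c1")
# A "hash:" prefix is parsed as a URI scheme, with the rest as the path.
scheme_form = urlsplit("hash:sha2-256:dbd318c1")

print(repr(hashtag_form.scheme), repr(hashtag_form.fragment))
print(repr(scheme_form.scheme), repr(scheme_form.path))
```

The hashtag form has no scheme at all, so on its own it does not identify anything; it only points into whatever document it is resolved against.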