w3c/json-ld-syntax

Allow JSON-LD documents to "pin" their @contexts and @imports

jyasskin opened this issue · 2 comments

In https://medium.com/@markus.sabadello/json-ld-vcs-are-not-just-json-4488d279be43, @peacekeeper points out that if someone signs the bytes of a JSON-LD document, the recipient of that signature can't be confident they're reading the same meaning that the signer intended, because the content served at the @context URL might have changed by the time they fetch it.

The Verifiable Credentials spec approaches this problem in https://w3c.github.io/vc-data-model/#base-context by saying that implementations must treat https://www.w3.org/ns/credentials/v2 as already-retrieved, and that the fixed version must have a particular hash. This works for that base context, but as @peacekeeper pointed out, VCs can still include other contexts that aren't constrained. This approach also isn't grounded in the JSON-LD algorithms, which just say to dereference the @contexts' values without this hash check.
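As a rough illustration of the "already-retrieved" approach, here is a minimal sketch of a loader that serves a pinned context from a local copy and checks that copy against an expected hash. Everything in it is an assumption made for illustration: `StaticContextLoader`, the example.org URL, and the tiny context body are hypothetical, and the real VC base context and its published hash are not reproduced here.

```python
import hashlib
from typing import Callable, Dict


class StaticContextLoader:
    """Hypothetical loader that serves pinned contexts from bundled copies."""

    def __init__(
        self,
        bundled: Dict[str, bytes],
        expected_sha256: Dict[str, str],
        fallback: Callable[[str], bytes],
    ) -> None:
        # Verify each bundled copy once, up front, against its expected hash.
        for url, body in bundled.items():
            actual = hashlib.sha256(body).hexdigest()
            if actual != expected_sha256.get(url):
                raise ValueError(f"bundled copy of {url} does not match its expected hash")
        self._bundled = dict(bundled)
        self._fallback = fallback

    def load(self, url: str) -> bytes:
        # Pinned URLs are treated as already retrieved and never dereferenced.
        if url in self._bundled:
            return self._bundled[url]
        # Any other context is still fetched without constraint.
        return self._fallback(url)


# Placeholder URL and body, not the real VC base context.
BASE_URL = "https://example.org/contexts/base/v1"
BASE_BODY = b'{"@context": {"name": "https://schema.org/name"}}'
# In practice this would be a constant published alongside the spec;
# it is computed here only to keep the sketch self-contained.
BASE_HASH = hashlib.sha256(BASE_BODY).hexdigest()


def refuse_network(url: str) -> bytes:
    raise RuntimeError(f"refusing to dereference {url}")


loader = StaticContextLoader({BASE_URL: BASE_BODY}, {BASE_URL: BASE_HASH}, refuse_network)
pinned_context = loader.load(BASE_URL)
```

The fallback branch is where the gap described above shows up: any context that is not in the bundled set is still dereferenced with no check at all.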

https://w3c.github.io/webappsec-subresource-integrity/ solves this problem for subresources within HTML by allowing authors to specify the expected digest for the bytes of the subresource. Parts of that design might be useful for building a similar capability into JSON-LD.
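For readers unfamiliar with SRI, the following sketch shows the core of it: an integrity string is a hash algorithm name, a hyphen, and the base64-encoded digest of the resource's bytes. The function names are hypothetical, and the sketch handles only a single hash per string, whereas the SRI spec also allows several space-separated hashes and options.

```python
import base64
import hashlib
import hmac

SUPPORTED_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_digest(data: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI-style integrity string such as 'sha384-<base64 digest>'."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(data: bytes, integrity: str) -> bool:
    """Check fetched bytes against a single-hash integrity string."""
    algorithm, _, _ = integrity.partition("-")
    if algorithm not in SUPPORTED_ALGORITHMS:
        return False
    return hmac.compare_digest(sri_digest(data, algorithm), integrity)


# Placeholder context body used only to exercise the functions.
context_bytes = b'{"@context": {"name": "https://schema.org/name"}}'
pinned = sri_digest(context_bytes)
```

A JSON-LD processor that knew the expected integrity string for a context could call something like `matches_integrity` on the dereferenced bytes before processing them. Where that string would live in a document is exactly what this issue leaves open.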

It's also possible to cache the @contexts (and any other dereferenced fields, like @import) using their digests. https://hillbrad.github.io/sri-addressable-caching/sri-addressable-caching.html discusses some difficulties for the web context, but many of those might not apply to most JSON-LD use cases.
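A minimal sketch of such a digest-keyed cache, assuming the document somehow supplies an expected integrity string for each dereferenced URL. `DigestAddressedCache`, `sha256_integrity`, and the injected `fetch` callable are hypothetical names, and only SHA-256 is handled.

```python
import base64
import hashlib
from typing import Callable, Dict


def sha256_integrity(body: bytes) -> str:
    """Return 'sha256-<base64 digest>' for the given bytes."""
    return "sha256-" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


class DigestAddressedCache:
    """Hypothetical cache keyed by content digest rather than by URL."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, bytes] = {}

    def get(self, url: str, integrity: str, fetch: Callable[[str], bytes]) -> bytes:
        # A hit needs no network access, whichever URL the bytes came from.
        cached = self._by_digest.get(integrity)
        if cached is not None:
            return cached
        body = fetch(url)
        # Only store bytes that actually match the digest they are filed under.
        if sha256_integrity(body) != integrity:
            raise ValueError(f"content at {url} does not match {integrity}")
        self._by_digest[integrity] = body
        return body
```

Because the key is the digest, two URLs that serve byte-identical contexts share one cache entry, and a context whose content changes at its URL is rejected rather than silently replacing the cached copy.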

My understanding is that relatedResource is a proposed way to address this, e.g. see this example:
https://w3c.github.io/vc-data-model/#example-usage-of-the-relatedresource-property

That section also references and reuses the Subresource Integrity spec.

relatedResource on its own is an untyped link, akin to <a href="..."> with no other attributes (most commonly rel), and should not be encouraged in our quest to make the web more semantically processable.

There has been a fairly large amount of discussion of the general topic of Data and Resource Integrity in the context of the Credentials CG and the Verifiable Credentials WG.

This also feeds into the question of whether one needs to address a document including all its intricacies (comments, formatting, etc.), or only some content of that document (such as the RDF statements contained within a Turtle document, where the Turtle could be changed substantially without changing the RDF). That question led to the launch of the RDF Dataset Canonicalization and Hash Working Group.
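The difference between hashing a document and hashing its content can be shown with a small sketch. It uses plain JSON with sorted keys as a stand-in for canonicalization, which is an assumption made purely for illustration: it is not the RDF Dataset Canonicalization algorithm, and the function names and sample documents are hypothetical.

```python
import hashlib
import json

# Two serializations of the same data, differing only in formatting and key order.
doc_a = b'{"name": "Alice", "age": 30}'
doc_b = b'{\n  "age": 30,\n  "name": "Alice"\n}'


def byte_hash(data: bytes) -> str:
    """Hash the document exactly as serialized, formatting included."""
    return hashlib.sha256(data).hexdigest()


def content_hash(data: bytes) -> str:
    """Hash a normalized re-serialization, so formatting no longer matters."""
    normalized = json.dumps(json.loads(data), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


same_bytes = byte_hash(doc_a) == byte_hash(doc_b)
same_content = content_hash(doc_a) == content_hash(doc_b)
```

Here `same_bytes` is false while `same_content` is true. A byte-level digest, as in SRI, pins the exact serialization, while a digest over canonicalized content tolerates reformatting at the cost of requiring a canonicalization step.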

See also