ipfs/kubo

Gateway support for dag-json and dag-cbor responses

lidel opened this issue · 7 comments

lidel commented

TODO

Ability to request dag-json or dag-cbor response with:

  • ?format=dag-json, Accept: application/vnd.ipld.dag-json
  • ?format=dag-cbor, Accept: application/vnd.ipld.dag-cbor
  • TBD: should requesting Accept: application/json / Accept: application/cbor return the IPLD Data Model representation (as DAG-JSON/CBOR) of any IPLD-compatible CID (could be dag-pb, or whatever) with Content-Type: application/json?
    • This is probably the killer feature for making IPLD interoperable with Web stack (JS, JSON everywhere), and allowing for creating custom UI in HTML/JS/CSS for data that ships with data itself.
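A minimal sketch of how the two request styles above could be resolved to a single response type. The function name negotiate and the lookup table are hypothetical, and letting ?format= win over the Accept header is an assumption made for illustration, not something this issue decides. The generic application/json and application/cbor types are deliberately left out because their handling is still TBD.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical lookup table; the names and media types come from the list above.
FORMAT_PARAM = {
    "dag-json": "application/vnd.ipld.dag-json",
    "dag-cbor": "application/vnd.ipld.dag-cbor",
}
ACCEPT_TYPES = set(FORMAT_PARAM.values())


def negotiate(url, accept_header=""):
    """Return the requested media type, or None if no explicit format was requested."""
    query = parse_qs(urlsplit(url).query)
    fmt = query.get("format", [None])[0]
    # Assumption: an explicit ?format= parameter takes precedence over Accept.
    if fmt in FORMAT_PARAM:
        return FORMAT_PARAM[fmt]
    # Otherwise scan the Accept header, ignoring parameters such as ;q=0.9.
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in ACCEPT_TYPES:
            return media_type
    return None
```

Returning None here leaves the "no explicit format" case to the caller, which is the open question discussed in the Q&A section.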

When we have this, we should also:

Why

See ipfs/in-web-browsers#182

In short, to unlock the power of IPLD by making IPFS data that follows IPLD Data Model available as dag-json and dag-cbor HTTP gateway response formats.

It is already possible on the CLI:

$ ipfs dag get --output-codec=dag-json $DIR_CID
{"Data": ...

Gateway support for this was previously attempted in #8037 – we need to refactor/rewrite that PR to follow conventions introduced in #8234 and #8758

Ecosystem impact around UnixFS

DAG-PB has a Logical Format which makes it possible to represent a dag-pb directory as a dag-json document.

This means ?format=dag-json will provide a way for supporting JSON responses for directory listing, which was also requested by our users.
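To illustrate, here is a sketch of how a client might turn such a response into a directory listing. It assumes the body follows the DAG-PB logical format (a node with optional Data and a Links list whose entries carry Hash, and optionally Name and Tsize) encoded as DAG-JSON, where links appear as {"/": "<cid>"} and bytes as {"/": {"bytes": "<base64>"}}. The sample document, its CID strings, and the helper list_directory are made up for the example.

```python
import json

# Hypothetical response body; the CID strings are placeholders, not real CIDs.
SAMPLE = """
{
  "Data": {"/": {"bytes": "CAE"}},
  "Links": [
    {"Hash": {"/": "bafy-placeholder-1"}, "Name": "a.txt", "Tsize": 120},
    {"Hash": {"/": "bafy-placeholder-2"}, "Name": "docs"}
  ]
}
"""


def list_directory(dag_json_text):
    """Extract name, CID and Tsize for every link of a dag-pb node given as DAG-JSON."""
    node = json.loads(dag_json_text)
    entries = []
    for link in node.get("Links", []):
        entries.append({
            "name": link.get("Name", ""),
            "cid": link["Hash"]["/"],     # DAG-JSON encodes a link as {"/": "<cid>"}
            "tsize": link.get("Tsize"),   # optional in the logical format, so may be None
        })
    return entries


if __name__ == "__main__":
    for entry in list_directory(SAMPLE):
        print(entry["name"], entry["cid"], entry["tsize"])
```

Because Tsize is optional, a consumer that renders sizes should be prepared for a missing value rather than treating it as zero.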

Q&A

  • How should non-unixfs CIDs be rendered by gateway when no explicit format is requested by the client?
    • TBD, initial idea is to return 400 Bad Request error with HTML body that links to ?format=dag-json|dag-cbor|raw|car
      • key reason is to avoid implicit defaults, and train users and tools to always request specific response type where it matters
  • Given that dag-json and dag-cbor are subsets of JSON and CBOR, should we return responses as application/vnd.ipld.dag-json or application/json?
    • TBD. A good case for returning application/json response is to improve interop with existing tools that speak JSON
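The "400 with links" idea from the first answer could look roughly like the sketch below. It is only an illustration of the proposal, not gateway code: the function name explicit_format_required, the wording of the HTML, and the (status, headers, body) return shape are all assumptions.

```python
from html import escape

# The four formats named in the proposal above.
FORMATS = ["dag-json", "dag-cbor", "raw", "car"]


def explicit_format_required(path):
    """Build (status, headers, body) for a non-UnixFS request that named no format."""
    links = "\n".join(
        f'<li><a href="{escape(path)}?format={fmt}">{fmt}</a></li>'
        for fmt in FORMATS
    )
    body = f"<p>Pick a response format:</p>\n<ul>\n{links}\n</ul>\n"
    # 400 signals the error to programs, while the HTML body stays clickable for people.
    return 400, {"Content-Type": "text/html; charset=utf-8"}, body
```

A program sees only the error status, while someone who followed a link in a browser still gets a page with working choices.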

initial idea is to return 400 Bad Request error with HTML body that links to ?format=dag-json|dag-cbor|raw|car
key reason is to avoid implicit defaults, and train users and tools to always request specific response type where it matters

I think this is a bad idea. It makes sense if the user is a developer whom you can train, but if the user is someone just clicking, you won’t train them; you’d just make their experience bad.

I would argue that:

  1. The gateway should still render something useful, e.g. an IPLD-explorer-like view, or at least some links that users can click. Alternatively, it could just redirect to a URL with the format query param attached.
  2. The status code could still be an error code if desired, yet the body should probably provide useful content nevertheless. That way programs will still interpret it as an error, while end users will still get to see a page they can interact with.

I think it is generally worth thinking about who the gateway's audience is and who we want it to be. I think for devs we do have the HTTP API; the gateway should be for everyone, and consequently fewer errors would mean a better experience with IPFS.

This would remove a lot of artificial barriers to mainstream adoption of the IPLD stack, i.e. the dag-cbor and dag-json codecs.

@bmann My guess is that this is also a challenge with WNFS?

  • ensure Size column in generated HTML Dir Index has the same value as Tsize in dag-json

    • it is important to have same numbers here UX-wise, and by switching to Tsize we could remove the need for feat(gateway): Gateway.FastDirIndexThreshold #8853 (Tsize is already in the root block of the dir, no need to fetch child nodes)
    • idea: on-hover title for Size column, noting "this is the total size of the DAG behind this CID, includes raw file data + IPLD metadata"

@lidel I want to note here that Tsize can be null for directories.

Was the idea here to change the API in the dir listing from Unixfs.Ls to Dag.Get?

lidel commented

Context here is that what we want to do in the dir-index-html code is to STOP fetching more than the root UnixFS node: #9058 (this is out of scope here, I've updated the issue and pointed at #9058)

Shouldn't it be application/vnd.ipld.dag+json and application/vnd.ipld.dag+cbor, i.e. with the +, to indicate that it is a subset of those formats using the structured syntax?

lidel commented

@IllidanS4 no, "DAG-JSON" and "DAG-CBOR" are not "just json and cbor view of DAG", these are names of specific formats that are subsets of each:

There is no generic "DAG": if you don't know the format of data behind a CID, you can only retrieve it as a single raw block or an opaque bag of blocks in CAR, that is why we have application/vnd.ipld.raw and application/vnd.ipld.car and no application/vnd.ipld.dag.

@lidel I see, thanks; I wasn't aware it should be synchronized with the multicodec table. Perhaps then it could have been application/vnd.ipld.dag-json+json etc. but that is a bit cumbersome. Nevertheless a similar scheme has been used for other formats, like application/rdf+xml which in a sense stands for RDF/XML and not just RDF, and similarly application/ld+json for JSON-LD (and there is no LD).