ipfs/in-web-browsers

IPLD support on Gateways

lidel opened this issue · 18 comments

lidel commented

Current state

Gateway provided by go-ipfs 0.8.0 supports only dag-pb (unixfsv1) and raw (raw block used for leaves) codecs.

Requesting any other IPLD type over a gateway fails.

Where we want to be

It should be possible to download everything over Gateway:

  • If it is impossible to provide a web-compatible response, at the bare minimum we should return a DAG archive (e.g. as CAR, see ipfs/kubo#170) so one can download it and then ipfs dag import <dag-archive> it into one's own node.
  • Gateway could return a more useful response for some IPLD types (like dag-cbor)

Low-hanging fruit: traversable JSON/CBOR documents

Some ideas on how to maximize the utility of gateways (these are just prompts for discussion, details TBD):

  • there should be a mechanism for controlling if dag-cbor and dag-json are returned as a valid JSON response with Content-type: application/json or application/cbor
  • it should be possible to traverse CBOR documents if one of the fields points at a CID

Ongoing work

Open questions

  1. Is CAR something we want to introduce, or do we want to wait for "CARv2" like anorth/go-dar?
  2. Should it be possible to include non-unixfs nodes inside of a unixfs directory? (this impacts MFS and ipfs-webui)
  3. What should be the default response type for dag-cbor? User will be able to choose, but what happens when there is no user preference? (Original binary, or should Gateway return JSON as it's more user friendly and makes onboarding easier?)
    • I was initially locked on keeping the original format at all cost, but I now see how returning JSON for dag-cbor by default makes it work out-of-the-box in a browser after copying and pasting the CID, which makes our stack "feel" approachable and easy to understand. This is huge for onboarding new users (devs).
  4. Should we support graphql-like queries against dag-cbor, so only specific fields are returned (think /ipfs/{dag-cbor-cid}?keys=image,name)? (This is separate from traversing CID tags in CBOR)
    • This could be a hidden killer feature when it comes to building web apps against the gateways.
      Any reason to not do this? Does this clash conceptually with planned support for selectors?
  5. Do we need to bikeshed what the format parameter should look like on Gateway?
    • This will be something that can be added to every path to override the default representation, used in both ipfs/kubo#8037 AND ipfs/kubo#7552
    • We already support --enc=json on CLI, and ?enc=json is short and easy to add by hand to a URL - trivial, but important for DX/UX
    • If we plan to support responses other than dag-json and dag-cbor, we could make it accept names from multicodec/table.csv
      • If we go with multicodec table, ?codec= or ?format= may be better than ?enc= (are they?)
  6. 👉 (I am sure there is more, please comment below)
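As a rough illustration of open question 4, the sketch below filters a decoded document down to the fields named in a ?keys= query parameter. The parameter name comes from the example URL above; the function name and the behaviour for missing keys (silently skipped) are assumptions made for illustration only.

```python
from urllib.parse import parse_qs, urlsplit


def filter_keys(document, url):
    """Return only the top-level fields listed in ?keys=a,b (hypothetical parameter)."""
    query = parse_qs(urlsplit(url).query)
    if "keys" not in query:
        # No filter requested: return the whole document.
        return document
    wanted = query["keys"][0].split(",")
    # Assumption: keys missing from the document are skipped, not an error.
    return {key: document[key] for key in wanted if key in document}


doc = {"name": "example", "image": "placeholder", "size": 3}
print(filter_keys(doc, "/ipfs/some-dag-cbor-cid?keys=image,name"))
```

This only selects top-level fields; whether such filtering should interact with path traversal or selectors is exactly the open question.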

cc @warpfork @alanshaw @Stebalien @aschmahmann

rvagg commented

Enthusiasm for CARv2 over DAR seems thin. cc @mikeal @rvagg

No, just enthusiasm over messing with CAR at this stage is a little thin, but none of that discussion should be considered a blocker, there's no gatekeepers on this. Maybe the experimental work on Lotus integration for either CARBS or DAR will help resolve this a little? If it turns out DAR unlocks value over there then maybe we double-down, make it a CARv2, write a spec and create an upgrade path where it matters. Either way though, there will have to be an upgrade path so integration into our suite of libraries will be important and gateway export could use either (or CAR now and CARv2 later).

Re response types and the discussion in ipfs/kubo#8037, it would be really unfortunate to reinforce the JSON format that go-ipfs has relied on today over the more formal DAG-JSON form. Now that it's easier to interact with go-ipld-prime in go-ipfs, that will hopefully be unlocked? Being able to pass a ?format=X for any node would be interesting, where X could be dag-json which could even apply to raw and unixfs blocks but also any other format that can go through go-ipfs. X could be car too.

lidel commented
  • Ok, perhaps Gateway should avoid talking about a specific CAR version, and simply say that it returns an opaque blob archive compatible with ipfs dag import|export? I wrote some thoughts in #170 (comment)
  • I've added Open Question (5) regarding how to name the parameter for overriding response "format".
    • I am pretty sure that in real life this will be used for either JSON (plain text) or CBOR (binary) representations, so ?enc=json|cbor sounds like an easy pick for ipfs/kubo#8037 and ipfs/kubo#7552 but lmk if this is a controversial choice for some reason.

@lidel from the description I do not really understand what the default content type is expected to be when navigating to e.g.

https://ipfs.io/ipfs/bafyreiahmfp4l6ii7h4ceumxdpaxwss5dldbevuvftjokcm6tx26gtgriy

Right now it's 404 with response body: ipfs cat /ipfs/bafyreiahmfp4l6ii7h4ceumxdpaxwss5dldbevuvftjokcm6tx26gtgriy: unknown node type which is probably the worst option.

I would argue that default should be along the lines of what is displayed by
https://explore.ipld.io/#/explore/bafyreiahmfp4l6ii7h4ceumxdpaxwss5dldbevuvftjokcm6tx26gtgriy

As it would provide a much better experience in navigating non-unixfs DAGs. The explorer view could be further improved to show a file listing when links are dag-pb.

I think this still aligns with making the gateway usable, as an Accept header / query parameter could be used to opt in to a data view as opposed to an HTML view of things.

I think having such a view would also help us make a future case for encoding NFTs as IPLD DAGs that have IPLD links to all the assets, as opposed to a JSON file that happens to have ipfs:// style links to assets.

lidel commented

@Gozala Yes, it was unclear because we are trying to figure it out. I think you have a valid point: we already provide a basic GUI for directory listings, we should replace/improve it with something that works with non-dag-pb nodes.
This sounds like a project on its own tho. Would you or someone from IPLD team be interested in writing a proposal for it? I think people close to IPLD should drive it, I'll be happy to help with go-ipfs/gateway side of things.

Anyway, we should not wait with CBOR support until we have UI like that.

Let's return an informative error asking to retry the request with an explicit content type, as suggested in ipfs/kubo#8037 (comment). This enables us to add a GUI in the future, without breaking anything that people build on top of CBOR-as-JSON responses.

Would you or someone from IPLD team be interested in writing a proposal for it? I think people close to IPLD should drive it, I'll be happy to help with go-ipfs/gateway side of things.

I would like that and I would have said yes, but I was already told to put all proposal-related work to the side, so I'm afraid I can't responsibly take this unless some resource reallocation takes place.

Shouldn't the logic happen in the HTTP request header:

curl -H "Accept: */*" "https://ipfs.io/ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae"

where the CID is: base32 - cidv1 - dag-cbor - sha2-256-256-785197229dc8bb1152945da58e2348f7e279eeded06cc2ca736d0e879858b501

The gateway inspects the multicodec, which is 0x71 (dag-cbor), and should return the native format:

*   Trying ::1...
* TCP_NODELAY set
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET /ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Clear-Site-Data: "cookies", "storage"
< Content-Type: application/cbor; charset=utf-8
< Location: http://bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae.ipfs.localhost:8081/
< X-Content-Type-Options: nosniff
< Date: Fri, 09 Apr 2021 15:34:12 GMT
< Content-Length: 94
<

* Connection #0 to host localhost left intact

Then the browser, if it understands application/cbor or ideally application/dag+cbor, can consume it, download it, etc.

Following on from this, if the request header were different:

curl -H "Accept: application/json" "https://ipfs.io/ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae"

Even though the CID is dag-cbor, the request wants application/json explicitly. Also, doing this in the parameter ?enc=json seems redundant. We just need to register dag+cbor as a MIME type.

*   Trying ::1...
* TCP_NODELAY set
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET /ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.58.0
> Accept: application/json
>
< HTTP/1.1 200 OK
< Clear-Site-Data: "cookies", "storage"
< Content-Type: application/json; charset=utf-8
< Location: http://bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae.ipfs.localhost:8081/
< X-Content-Type-Options: nosniff
< Date: Fri, 09 Apr 2021 15:34:12 GMT
< Content-Length: 94
<
{"hello" : "world"}
* Connection #0 to host localhost left intact
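The CID breakdown given earlier in this comment (base32, CIDv1, dag-cbor, sha2-256) can be checked by hand. The following is a minimal sketch that only handles base32 CIDv1 strings (multibase prefix "b"); the function names are made up for this example and it is not how go-ipfs parses CIDs.

```python
import base64


def read_varint(data, offset):
    """Decode an unsigned LEB128 varint; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, offset
        shift += 7


def inspect_cid(cid):
    """Split a base32 CIDv1 string into version, codec and multihash parts."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase prefix 'b') is handled here")
    # Multibase base32 is lowercase and unpadded; the stdlib wants upper + padding.
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)
    digest = raw[pos:pos + digest_len]
    return {
        "version": version,
        "codec": codec,          # 0x71 is dag-cbor
        "hash_code": hash_code,  # 0x12 is sha2-256
        "digest_len": digest_len,
        "digest": digest.hex(),
    }


info = inspect_cid("bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae")
print(hex(info["codec"]), info["digest"])
```

A gateway that reads the codec this way could pick the native content type without looking inside the block.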

Gateway provided by go-ipfs 0.8.0 supports only dag-pb (unixfsv1) and raw (raw block used for leaves) codecs.
Requesting any other IPLD type over a gateway fails.

This is very misleading. IPFS supports unixfs (files) over the gateway, speaking of this in terms of IPLD codecs makes no sense (I can't, e.g., get a raw dag-pb node, etc.).

It should be possible to download everything over Gateway:

What about the read-only API?

Where we want to be

We do? Why?


I'd love a feature like this. I just want to make sure we're all on the same page about what "this" is, why we care about "this", etc.

  1. What problem are we trying to solve here? What use-cases are we trying to enable, why are existing features not sufficient, etc.
  2. What other solutions have we considered, etc.
lidel commented
  • Accept is something we want to support, but is not enough on its own: it does not work when pasted into the address bar of a web browser, or as a target of a link on a webpage. It also gets filtered out by poorly written proxy/middleware software. We need a query parameter for those use cases anyway.

  • /api/v0

    • is RPC-over-HTTP, was not designed for use in browsers, does not work on DNSLink websites, and we don't want to sink time into constantly fixing issues like ipfs/kubo#7959 or ipfs/kubo#6746 etc ...
    • I believe it is better to move away from /api/v0 for basic operations like fetching a DAG or block, or reading/converting a CBOR/JSON document.

Created ipfs/kubo#8234 with an initial design proposal for /ipfs/{cid}?format=car|block|dag-cbor|dag-json|.. that aims to streamline all those threads under something that is intuitive to use and extend.
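One way to picture how the query parameter and the Accept header could coexist is sketched below. The precedence (explicit ?format= first, then Accept, then the CID's native codec) and the media-type mapping are assumptions for illustration and are not taken from ipfs/kubo#8234; q-values in Accept are ignored to keep the sketch short.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative mapping only; the media types are the ones mentioned in this thread.
ACCEPT_TO_FORMAT = {
    "application/json": "dag-json",
    "application/cbor": "dag-cbor",
    "application/vnd.ipld.dag-cbor": "dag-cbor",
}


def choose_format(url, accept_header, native_format):
    """Pick a response format: ?format= wins, then Accept, then the native codec."""
    query = parse_qs(urlsplit(url).query)
    if "format" in query:
        # An explicit URL parameter works in address bars and plain links.
        return query["format"][0]
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalise case.
        media_type = item.split(";")[0].strip().lower()
        if media_type in ACCEPT_TO_FORMAT:
            return ACCEPT_TO_FORMAT[media_type]
    # "*/*" or an unknown type falls through to the block's own codec.
    return native_format


print(choose_format("/ipfs/some-cid?format=car", "application/json", "dag-cbor"))
print(choose_format("/ipfs/some-cid", "*/*", "dag-cbor"))
```

Letting the URL parameter win keeps links shareable even when a proxy strips or rewrites the Accept header.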

lidel commented

Would it make sense to have a /ipld/ path on gateways for loading arbitrary IPLD data and for future support for POST /ipld/ similar to the writeable gateway spec for IPFS?

lidel commented

Good question! Whatever happens on gateways, the aim is to have it work with "native" schemes too. To reduce implementation overhead across ecosystem, we have the same abstraction on /ipfs/ and ipfs:// etc.

The ipld:// is not a thing atm, but my current thinking is that we will have IPLD support on /ipfs/ via ?format={ipld} or Accept header (wip: ipfs/kubo#8758)

In case of writable gateways, what we could implement is POST /ipfs/ that is sent with Content-Type: application/vnd.ipld.dag-cbor (or to /ipfs/?format=dag-cbor)
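For a sense of what such a write could look like from a client, here is a sketch that builds (but does not send) a POST request carrying a tiny dag-cbor body. The writable endpoint is only a possibility discussed in this thread, the localhost:8081 address is borrowed from the curl output earlier, and the encoder is a hand-rolled helper that handles only small maps of short strings.

```python
import urllib.request


def encode_text(text):
    """CBOR text string (major type 3); only lengths below 24 are handled."""
    data = text.encode("utf-8")
    if len(data) > 23:
        raise ValueError("sketch only handles short strings")
    return bytes([0x60 | len(data)]) + data


def encode_small_map(mapping):
    """CBOR map (major type 5) of short strings, keys in dag-cbor order."""
    if len(mapping) > 23:
        raise ValueError("sketch only handles small maps")
    out = bytes([0xA0 | len(mapping)])
    # dag-cbor sorts map keys by length first, then bytewise.
    for key in sorted(mapping, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8"))):
        out += encode_text(key) + encode_text(mapping[key])
    return out


body = encode_small_map({"hello": "world"})

# Hypothetical writable-gateway endpoint; the request is built, not sent.
request = urllib.request.Request(
    "http://localhost:8081/ipfs/",
    data=body,
    method="POST",
    headers={"Content-Type": "application/vnd.ipld.dag-cbor"},
)
print(request.get_method(), request.full_url, body.hex())
```

The ?format=dag-cbor variant mentioned above would carry the same body, with the codec stated in the URL instead of the header.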

Would it make sense to have a /ipld/ path on gateways for loading arbitrary IPLD data and for future support for POST /ipld/ similar to the writeable gateway spec for IPFS?

Oh, now that would be interesting!

The ipld:// is not a thing atm, but my current thinking is that we will have IPLD support on /ipfs/ via ?format={ipld} or Accept header (wip: ipfs/kubo#8758)

Let's make 'ipld:' a thing!

I've floated ipld:// by @warpfork a bit and I also think it'd be useful to have, coming from making apps in Agregore.

lidel commented

I am open to making it happen, but the main question we will hit with bigger vendors and developers will be: what is the difference between ipfs://, ipns:// and ipld:// ?

Right now, the story is clear:

  • /ipfs/ - immutable namespace
  • /ipns/ - mutable namespace

What do we gain by introducing a second immutable namespace?
Which behaviors would be different?

I was thinking /ipld/ is for interacting with raw IPLD data, and /ipfs/ is for interacting with UNIXFS-like data.

E.g., doing a PUT to /ipfs/ assumes adding a file to a UNIXFS dir and the body is the content of a file, while doing a PUT to /ipld/ requires it to be in some sort of IPLD-supported codec (maybe in the content-type header?).

Also potential for using the new IPLD patch spec for PATCH methods.

Some stuff that might be IPLD specific is stuff like Schemas, the different DAG traversal syntaxes and stuff like that.

At the moment it's a lot harder to do this fancy stuff with IPLD, and putting it into /ipfs/ where pretty much everything is file related doesn't feel as great.

Might be cool to brainstorm something more concrete if there's interest.

Started putting together a PR here: ipfs/specs#293

Comments on current direction would be very much appreciated. Note that a lot of this is based on stuff from this initial exploration report: https://github.com/ipld/ipld/blob/e6cfab631d2bd24bf158d3a85e126514c98de5ce/notebook/exploration-reports/2022.03-ipld-url-scheme.md#future-work