IPLD support on Gateways
lidel opened this issue · 18 comments
Current state
Gateway provided by go-ipfs 0.8.0 supports only dag-pb (unixfsv1) and raw (raw block used for leaves) codecs.
Requesting any other IPLD type over a gateway fails.
Where we want to be
It should be possible to download everything over Gateway:
- If it is impossible to provide web-compatible response, at the bare minimum we should return DAG archive (e.g. as CAR, see ipfs/kubo#170) so one can download it and then ipfs dag import <dag-archive> it to own node.
- Gateway could return more useful response for some IPLD types (like dag-cbor)
Low-hanging fruit: traversable JSON/CBOR documents
Some ideas on how to maximize the utility of gateways (these are just prompts for discussion, details TBD):
- there should be a mechanism for controlling whether dag-cbor and dag-json are returned as a valid JSON response with Content-type: application/json or application/cbor
- it should be possible to traverse CBOR documents if one of the fields points at a CID
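A rough sketch of what traversing a document through a CID-valued field could look like. It assumes the document is already decoded into plain Python values and that links use the DAG-JSON form {"/": "<cid>"}. The names `is_link` and `resolve_path` and the `fetch` callback are hypothetical, not gateway APIs.

```python
def is_link(node):
    """A DAG-JSON link is a map whose only key is "/" and whose value is a CID string."""
    return isinstance(node, dict) and set(node) == {"/"} and isinstance(node["/"], str)


def resolve_path(node, segments, fetch):
    """Walk path segments through a decoded document, following links on the way.

    `fetch` is a hypothetical callback: CID string -> decoded document.
    """
    for segment in segments:
        if is_link(node):
            node = fetch(node["/"])  # cross into the linked block
        if isinstance(node, list):
            node = node[int(segment)]
        elif isinstance(node, dict):
            node = node[segment]
        else:
            raise KeyError(f"cannot descend into a scalar at {segment!r}")
    # If the path ends on a link, return the linked document itself.
    if is_link(node):
        node = fetch(node["/"])
    return node


# Placeholder strings stand in for real CIDs in this example.
store = {"cid-author": {"name": "alice"}}
root = {"title": "post", "author": {"/": "cid-author"}, "tags": ["a", "b"]}
print(resolve_path(root, ["author", "name"], store.__getitem__))
```

Under these assumptions, a request for /ipfs/{cid}/author/name would map to the segments ["author", "name"].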
Ongoing work
- @alanshaw is adding CBOR support in ipfs/kubo#8037
Open questions
- Is CAR something we want to introduce, or do we want to wait for "CARv2" like anorth/go-dar?
- Should it be possible to include non-unixfs nodes inside of a unixfs directory? (this impacts MFS and ipfs-webui)
- What should be the default response type for dag-cbor? User will be able to choose, but what happens when there is no user preference? (Original binary, or should Gateway return JSON as it's more user friendly and makes onboarding easier?)
  - I was initially locked on keeping original format at all cost, but I now see how returning JSON for dag-cbor by default makes it work out-of-the-box in browser after copying and pasting the CID, which makes our stack "feel" approachable and easy to understand. This is huge for onboarding new users (devs).
- Should we support graphql-like queries against dag-cbor, so only specific fields are returned (think /ipfs/{dag-cbor-cid}?keys=image,name)? (This is separate from traversing CID tags in CBOR)
  - This could be a hidden killer feature when it comes to building web apps against the gateways. Any reason to not do this? Does this clash conceptually with planned support for selectors?
- Do we need to bikeshed what the format parameter should look like on Gateway?
  - This will be something that can be added to every path to override the default representation, used in both ipfs/kubo#8037 AND ipfs/kubo#7552
  - We already support --enc=json on CLI, ?enc=json is short and easy to add by hand to URL - trivial, but important for dx/ux
  - If we plan to support responses other than dag-json and dag-cbor, we make it accept name from multicodec/table.csv
    - If we go with multicodec table, ?codec= or ?format= may be better than ?enc= (are they?)
- (I am sure there is more, please comment below)
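To make the ?keys=image,name idea from the open questions concrete, here is a minimal sketch of field filtering on an already decoded document. The `keys` parameter comes from the question above; the function name `select_keys` and the choice to silently skip missing keys are assumptions for illustration only.

```python
from urllib.parse import urlsplit, parse_qs


def select_keys(url, document):
    """Return only the top-level fields named in ?keys=a,b (hypothetical parameter).

    Without a keys parameter the whole document is returned unchanged.
    """
    query = parse_qs(urlsplit(url).query)
    if "keys" not in query:
        return document
    wanted = [key for value in query["keys"] for key in value.split(",") if key]
    # Assumption: keys missing from the document are skipped, not treated as errors.
    return {key: document[key] for key in wanted if key in document}


doc = {"name": "example", "image": "pic.png", "description": "long text"}
print(select_keys("/ipfs/{dag-cbor-cid}?keys=image,name", doc))
```

This only covers top-level fields. Nested selection would be where overlap with selectors starts to matter.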
Enthusiasm for CARv2 over DAR seems thin. cc @mikeal @rvagg
No, just enthusiasm over messing with CAR at this stage is a little thin, but none of that discussion should be considered a blocker, there are no gatekeepers on this. Maybe the experimental work on Lotus integration for either CARBS or DAR will help resolve this a little? If it turns out DAR unlocks value over there then maybe we double-down, make it a CARv2, write a spec and create an upgrade path where it matters. Either way though, there will have to be an upgrade path so integration into our suite of libraries will be important and gateway export could use either (or CAR now and CARv2 later).
Re response types and the discussion in ipfs/kubo#8037, it would be really unfortunate to reinforce the JSON format that go-ipfs has relied on today over the more formal DAG-JSON form. Now that it's easier to interact with go-ipld-prime in go-ipfs, that will hopefully be unlocked? Being able to pass a ?format=X for any node would be interesting, where X could be dag-json which could even apply to raw and unixfs blocks but also any other format that can go through go-ipfs. X could be car too.
- Ok, perhaps Gateway should avoid talking about specific CAR version, and simply say that it returns an opaque blob archive compatible with ipfs dag import|export? I wrote some thoughts in #170 (comment)
- I've added Open Question (5) regarding how to name the parameter for overriding response "format".
  - I am pretty sure that in real life this will be used for either JSON (plain text) or CBOR (binary) representations, so ?enc=json|cbor sounds like an easy pick for ipfs/kubo#8037 and ipfs/kubo#7552 but lmk if this is a controversial choice for some reason.
@lidel from the description I do not really understand what the default content type is expected to be when navigating to e.g.
https://ipfs.io/ipfs/bafyreiahmfp4l6ii7h4ceumxdpaxwss5dldbevuvftjokcm6tx26gtgriy
Right now it's a 404 with response body "ipfs cat /ipfs/bafyreiahmfp4l6ii7h4ceumxdpaxwss5dldbevuvftjokcm6tx26gtgriy: unknown node type", which is probably the worst option.
I would argue that the default should be along the lines of what is displayed by
https://explore.ipld.io/#/explore/bafyreiahmfp4l6ii7h4ceumxdpaxwss5dldbevuvftjokcm6tx26gtgriy
as it would provide a much better experience in navigating non-unixfs DAGs. Explorer view could be further improved to show file listing when links are dag-pb.
I think this still aligns with making the gateway usable, as an Accept header / query parameter could be used to opt in to a data view as opposed to an HTML view of things.
I think having such a view would also help us make a future case for encoding NFT as IPLD Dags that have IPLD links to all the assets as opposed to json file that happens to have ipfs:// style links to assets.
@Gozala Yes, it was unclear because we are trying to figure it out. I think you have a valid point: we already provide a basic GUI for directory listings, we should replace/improve it with something that works with non-dag-pb nodes.
This sounds like a project on its own tho. Would you or someone from IPLD team be interested in writing a proposal for it? I think people close to IPLD should drive it, I'll be happy to help with go-ipfs/gateway side of things.
Anyway, we should not wait with CBOR support until we have UI like that.
Let's return an informative error to retry the request with explicit content type, as suggested in ipfs/kubo#8037 (comment). This enables us to add GUI in the future, without breaking anything that people build on top of CBOR-as-JSON responses.
Would you or someone from IPLD team be interested in writing a proposal for it? I think people close to IPLD should drive it, I'll be happy to help with go-ipfs/gateway side of things.
I would like that and I would have said yes, but I was already told to put all the proposal-related work on the side, so I'm afraid I can't responsibly take this unless some resource reallocation takes place.
shouldn't the logic happen in the http request header:
curl -H "Accept: */*" "https://ipfs.io/ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae"
where the CID is: base32 - cidv1 - dag-cbor - sha2-256-256-785197229dc8bb1152945da58e2348f7e279eeded06cc2ca736d0e879858b501
The gateway inspects the multicodec, which is 0x71 (dag-cbor), and returns the native format. It should return:
* Trying ::1...
* TCP_NODELAY set
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET /ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Clear-Site-Data: "cookies", "storage"
< Content-Type: application/cbor; charset=utf-8
< Location: http://bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae.ipfs.localhost:8081/
< X-Content-Type-Options: nosniff
< Date: Fri, 09 Apr 2021 15:34:12 GMT
< Content-Length: 94
<
* Connection #0 to host localhost left intact
Then the browser, if it understands application/cbor or ideally application/dag+cbor, can consume it and download it etc.
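The "inspect the multicodec" step can be sketched with only the standard library. This is a minimal illustration, assuming a base32 CIDv1 string (multibase prefix "b"). The helper names `read_varint` and `inspect_cid` are made up for this example and are not part of go-ipfs.

```python
import base64


def read_varint(buf, pos):
    """Decode one unsigned varint starting at buf[pos]; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def inspect_cid(cid):
    """Split a base32 CIDv1 into version, content codec, hash function and digest."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 CIDs (multibase prefix 'b')")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # the stdlib decoder expects padded input
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)
    return {
        "version": version,
        "codec": codec,
        "hash_code": hash_code,
        "digest": raw[pos:pos + digest_len],
    }


info = inspect_cid("bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae")
print(hex(info["codec"]))  # 0x71 is dag-cbor in the multicodec table
```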
Following on from this, if the request header was different:
curl -H "Accept: application/json" "https://ipfs.io/ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae"
even though the CID is dag-cbor, the request wants application/json explicitly. Also, doing this in the parameter ?enc=json seems redundant. We just need to register dag+cbor as a MIME type.
* Trying ::1...
* TCP_NODELAY set
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET /ipfs/bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.58.0
> Accept: application/json
>
< HTTP/1.1 200 OK
< Clear-Site-Data: "cookies", "storage"
< Content-Type: application/json; charset=utf-8
< Location: http://bafyreidykglsfhoixmivffc5uwhcgshx4j465xwqntbmu43nb2dzqwfvae.ipfs.localhost:8081/
< X-Content-Type-Options: nosniff
< Date: Fri, 09 Apr 2021 15:34:12 GMT
< Content-Length: 94
<
{"hello" : "world"}
* Connection #0 to host localhost left intact
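The two requests above amount to simple Accept-based negotiation. Below is a small sketch of that decision for a dag-cbor block. It assumes the gateway offers only application/cbor (native) and application/json, as in the example responses. `pick_content_type` is a hypothetical helper, and returning None stands in for answering 406 Not Acceptable.

```python
NATIVE_TYPE = "application/cbor"  # content type used for dag-cbor in the example above
OFFERED = ("application/cbor", "application/json")


def pick_content_type(accept_header, native=NATIVE_TYPE):
    """Pick a response type for a dag-cbor block from an Accept header value."""
    ranges = []
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        quality = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        ranges.append((quality, media))
    if not ranges:
        return native  # no preference stated: serve the native format
    # Highest q first; the sort is stable, so ties keep header order.
    for quality, media in sorted(ranges, key=lambda item: -item[0]):
        if quality <= 0:
            continue
        if media in ("*/*", "application/*"):
            return native
        if media in OFFERED:
            return media
    return None  # nothing acceptable


print(pick_content_type("*/*"))
print(pick_content_type("application/json"))
```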
Gateway provided by go-ipfs 0.8.0 supports only dag-pb (unixfsv1) and raw (raw block used for leaves) codecs.
Requesting any other IPLD type over a gateway fails.
This is very misleading. IPFS supports unixfs (files) over the gateway, speaking of this in terms of IPLD codecs makes no sense (I can't, e.g., get a raw dag-pb node, etc.).
It should be possible to download everything over Gateway:
What about the read-only API?
Where we want to be
We do? Why?
I'd love a feature like this. I just want to make sure we're all on the same page about what "this" is, why we care about "this", etc.
- What problem are we trying to solve here? What use-cases are we trying to enable, why are existing features not sufficient, etc.
- What other solutions have we considered, etc.
- Accept is something we want to support, but is not enough on its own: it does not work when pasted into address bar of a web browser, or as a target of a link on a webpage. It also gets filtered out by poorly written proxy/middleware software. We need query parameter for those use cases anyway.
- /api/v0 is RPC-over-HTTP, was not designed for use in browsers, does not work on DNSLink websites, and we don't want to sink time into constantly fixing issues like ipfs/kubo#7959 or ipfs/kubo#6746 etc ...
  - I believe it is better to move away from /api/v0 for basic operations like fetching DAG, block or reading/converting CBOR/JSON document.
Created ipfs/kubo#8234 with initial design proposal for /ipfs/{cid}?format=car|block|dag-cbor|dag-json|.. that aims to streamline all those threads under something that is intuitive to use and extend.
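As a rough illustration of the ?format= proposal, the sketch below maps a format value from the URL to a response content type. The format names are the ones listed in the proposal. The content types are assumptions that follow the application/vnd.ipld.* naming pattern, and `requested_format` is a hypothetical helper, not the go-ipfs implementation.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed content types; the real values are still being discussed in the linked issues.
FORMAT_TO_CONTENT_TYPE = {
    "car": "application/vnd.ipld.car",
    "block": "application/vnd.ipld.raw",
    "dag-cbor": "application/vnd.ipld.dag-cbor",
    "dag-json": "application/vnd.ipld.dag-json",
}


def requested_format(url):
    """Return (format, content_type) for ?format=..., or None when it is absent."""
    values = parse_qs(urlsplit(url).query).get("format")
    if not values:
        return None  # caller falls back to the default representation
    name = values[0].lower()
    if name not in FORMAT_TO_CONTENT_TYPE:
        raise ValueError(f"unsupported format: {name!r}")
    return name, FORMAT_TO_CONTENT_TYPE[name]


print(requested_format("/ipfs/{cid}?format=car"))
```

A URL like this still works when pasted into an address bar or used as a link target, which is the case the Accept header cannot cover.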
- Block/CAR support on HTTP Gateways is wip in ipfs/kubo#8758
- IPLD content types for Block/CAR are discussed in ipld/go-car#238
- Exposing IPLD Selectors on HTTP Gateways is discussed in https://github.com/ipfs/go-ipfs/issues/8769
Would it make sense to have a /ipld/ path on gateways for loading arbitrary IPLD data and for future support for POST /ipld/ similar to the writeable gateway spec for IPFS?
Good question! Whatever happens on gateways, the aim is to have it work with "native" schemes too. To reduce implementation overhead across ecosystem, we have the same abstraction on /ipfs/ and ipfs:// etc.
The ipld:// is not a thing atm, but my current thinking is that we will have IPLD support on /ipfs/ via ?format={ipld} or Accept header (wip: ipfs/kubo#8758)
In case of writable gateways, what we could implement is POST /ipfs/ that is sent with Content-Type: application/vnd.ipld.dag-cbor (or to /ipfs/?format=dag-cbor)
Would it make sense to have a /ipld/ path on gateways for loading arbitrary IPLD data and for future support for POST /ipld/ similar to the writeable gateway spec for IPFS?
Oh, now that would be interesting!
The ipld:// is not a thing atm, but my current thinking is that we will have IPLD support on /ipfs/ via ?format={ipld} or Accept header (wip: ipfs/kubo#8758)
Let's make 'ipld:' a thing!
I am open to making it happen, but the main question we will hit with bigger vendors and developers will be: what is the difference between ipfs://, ipns:// and ipld://?
Right now, the story is clear:
- /ipfs/ - immutable namespace
- /ipns/ - mutable namespace
What do we gain by introducing a second immutable namespace?
Which behaviors would be different?
I was thinking /ipld/ is for interacting with raw IPLD data, and /ipfs/ is for interacting with UNIXFS-like data.
E.g., doing a PUT to /ipfs/ assumes adding a file to a UNIXFS dir and the body is the content of a file, doing a PUT to /ipld/ requires it to be in some sort of IPLD-supported codec (maybe in the content-type header?).
Also potential for using the new IPLD patch spec for PATCH methods.
Some stuff that might be IPLD specific is stuff like Schemas, the different DAG traversal syntaxes and stuff like that.
At the moment it's a lot harder to do this fancy stuff with IPLD, and putting it into /ipfs/ where pretty much everything is file related doesn't feel as great.
Might be cool to brainstorm something more concrete if there's interest.
Started putting together a PR here: ipfs/specs#293
Comments on current direction would be very much appreciated. Note that a lot of this is based on stuff from this initial exploration report: https://github.com/ipld/ipld/blob/e6cfab631d2bd24bf158d3a85e126514c98de5ce/notebook/exploration-reports/2022.03-ipld-url-scheme.md#future-work