ipfs/specs

Create IPIP: UnixFS and Gateway support for explicit MIME/Content-Type

lidel opened this issue · 1 comments

lidel commented

This is a placeholder issue for creating an IPIP that adds support for storing an explicit MIME/Content-Type in the UnixFS DAG itself, as we already do for the opt-in mode and mtime attributes, and for acting on its presence on Gateways.

There is an alternative approach: allow stating an explicit Content-Type header via a _headers file (similar to the recently shipped _redirects). Some notes are in #257.

Context

UnixFSv2 never happened, but the ability to explicitly specify Content-Type was one of my asks: ipld/legacy-unixfs-v2#11

Years later, we still guess Content-Type on gateways.
While that is fine most of the time, we should give users the ability to explicitly set the media type at the time of data onboarding.

Since 2018 there has been some related development: around 2020 we added opt-in support for the mode and mtime attributes (#217 (comment)). Support for mode and mtime was implemented in JS-IPFS a while ago; Kubo still has an open PR (ipfs/go-unixfs#117).

The initial idea (details to be fleshed out in the IPIP) is to introduce an optional mtype in a similar way.
This is an alternative to introducing the _headers file mentioned in #257.

Ref.

TODO

  • wait until UnixFS specs land (#331)
  • create an IPIP against UnixFS specs that
    • defines a canonical way of storing explicit MIME (content/media type) in UnixFS root dag-pb blocks
    • modifies the Gateway spec to disable MIME sniffing and use the sanitized (ASCII-only) value from dag-pb in the Content-Type header
  • create a reference implementation in Kubo that
    • adds an optional --content-type to ipfs add
    • skips sniffing and sets the Content-Type header on the gateway, if an explicit value is present
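
The gateway behavior proposed in the TODO items above could look roughly like the sketch below. It is only an illustration under assumptions, not Kubo code: `resolveContentType` and `sanitizeMediaType` are hypothetical names, the stored value is assumed to arrive as a plain string read from the root dag-pb block, and the exact sanitization rules are still to be defined by the IPIP. Only Go standard library calls are used (`mime.ParseMediaType`, `mime.FormatMediaType`, `http.DetectContentType`).

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"strings"
)

// sanitizeMediaType validates a stored media type (hypothetical helper).
// It rejects empty values, anything outside printable ASCII (which also
// blocks CR/LF header injection), and values that do not parse as type/subtype.
func sanitizeMediaType(v string) (string, bool) {
	if v == "" {
		return "", false
	}
	for i := 0; i < len(v); i++ {
		if v[i] < 0x20 || v[i] > 0x7e {
			return "", false
		}
	}
	mediaType, params, err := mime.ParseMediaType(v)
	if err != nil || !strings.Contains(mediaType, "/") {
		return "", false
	}
	// Re-serialize so the header value is in a normalized form.
	out := mime.FormatMediaType(mediaType, params)
	if out == "" {
		return "", false
	}
	return out, true
}

// resolveContentType prefers the explicit stored value and falls back to
// sniffing the first bytes of the file only when no usable value is present.
func resolveContentType(stored string, head []byte) string {
	if ct, ok := sanitizeMediaType(stored); ok {
		return ct
	}
	return http.DetectContentType(head)
}

func main() {
	head := []byte("<html><body>hi</body></html>")
	// Explicit value present: sniffing is skipped.
	fmt.Println(resolveContentType("image/svg+xml", head))
	// No explicit value: fall back to sniffing, as gateways do today.
	fmt.Println(resolveContentType("", head))
}
```

Falling back to sniffing when the stored value fails validation is one possible choice; the IPIP could equally decide to return an error or a generic type in that case.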

lidel commented

Need some help with historical context:

  • UnixFS 1.5 added Data.mode and Data.mtime fields
  • What we propose in this issue is "UnixFS 1.6" (tbd)
    • My initial idea was to add a Content Type field as Data.ctype...
    • ... but I've noticed there is already a Metadata.MimeType field in unixfs.proto (it does not seem to be used for anything anywhere, though).
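
To make the two candidate locations from the list above concrete, here is a rough Go model of them. It is a sketch under assumptions: the structs mirror only the fields discussed here, `Ctype` is the hypothetical new field (no field number or final name has been chosen), and the lookup order in `explicitMediaType` is arbitrary rather than a proposal.

```go
package main

import "fmt"

// Data models a subset of the UnixFS Data message.
// Mode is one of the UnixFS 1.5 additions; Ctype is hypothetical.
type Data struct {
	Mode  *uint32
	Ctype *string // hypothetical: the "Data.ctype" option
}

// Metadata models the existing Metadata message and its MimeType field.
type Metadata struct {
	MimeType *string // the "Metadata.MimeType" option
}

// explicitMediaType returns the explicit media type from whichever location
// is populated. Checking Data first is an arbitrary choice for illustration.
func explicitMediaType(d Data, m *Metadata) (string, bool) {
	if d.Ctype != nil && *d.Ctype != "" {
		return *d.Ctype, true
	}
	if m != nil && m.MimeType != nil && *m.MimeType != "" {
		return *m.MimeType, true
	}
	return "", false
}

func main() {
	svg := "image/svg+xml"
	ct, ok := explicitMediaType(Data{}, &Metadata{MimeType: &svg})
	fmt.Println(ct, ok)
}
```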

It feels like there is a reason why Metadata was not used for mtime and mode (see js-ipfs-unixfs/unixfs.proto).

@achingbrain @alanshaw do you remember why these fields landed in Data and not Metadata? Or who would be a good person to ask?