ipfs/ipfs-companion

Support Custom Protocols in WebExtension

lidel opened this issue · 17 comments

lidel commented

(living summary: updated @ 2018-09)

This issue tracks browser extension support for various protocol schemes and URIs according to the four stages of the upgrade path for path addressing and the IPFS Addressing in Web Browsers memo, namely:

URLs:

ipfs://{cidv1b32} → http://127.0.0.1:8080/ipfs/{cidv1b32}
ipns://{hash} → http://127.0.0.1:8080/ipns/{hash}

URI:

dweb:/ip[f|n]s/{hash} → http://127.0.0.1:8080/ip[f|n]s/{hash}
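
The mappings above can be sketched as a small pure function. This is only an illustration, not the extension's actual code: the helper name `toGatewayUrl` and its `gateway` parameter are hypothetical, and the default gateway is the local one used in the examples above.

```javascript
// Hypothetical helper: maps ipfs://, ipns:// and dweb: addresses to a gateway URL.
// The default gateway is the local one from the examples above.
function toGatewayUrl(uri, gateway = 'http://127.0.0.1:8080') {
  // URL forms: ipfs://{cidv1b32}[/path] and ipns://{hash}[/path]
  let m = uri.match(/^(ipfs|ipns):\/\/(.+)$/);
  if (m) return `${gateway}/${m[1]}/${m[2]}`;

  // URI form: dweb:/ipfs/{hash} and dweb:/ipns/{hash}
  m = uri.match(/^dweb:(\/ip[fn]s\/.+)$/);
  if (m) return `${gateway}${m[1]}`;

  // Anything else is not ours to handle.
  return null;
}

console.log(toGatewayUrl('ipfs://bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi'));
console.log(toGatewayUrl('dweb:/ipns/example.com/index.html'));
```

Returning `null` for unrecognized input lets a caller fall through to the browser's default handling.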

This issue also tracks (currently nonexistent) ways to address or work around how Origin is calculated (Problem #2)

WebExtension APIs

Universal API

No such thing yet (but see WIP work in comments)

  • BUT! We use the most insane workaround in #164 (comment). It works in both Firefox (Desktop-only – #348) and Chrome and is the best thing we can do as of late 2017. It was merged with #283.

Firefox

Chrome / Chromium

Brave

Muon-based (deprecated in 2018)

  • browser.protocol.registerStringProtocol available to trusted extensions
  • missing api for Buffer/Stream-based protocols (#312 (comment))

Chromium-based

  • Basically same challenges as Chromium right now

Related discussions

lidel commented

Support for simplified redirect-based protocol handler (simple redirect, no origin support) landed in Firefox 54:

This is a thin UX on top of Navigator.registerProtocolHandler():

Note that this is just a redirect:

This means the address in the location bar changes (the Origin barrier is broken) and there are different security contexts for public and local gateways (different cookies, etc).

And yes, this is Firefox-only.
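
A rough sketch of what such a redirect-based handler amounts to: a `protocol_handlers` manifest entry carries a `uriTemplate` containing `%s`, and the browser substitutes the escaped address for `%s` before navigating. The gateway host `gateway.example`, the handler name, and the helper `expandTemplate` are assumptions made for illustration, and the substitution is only simulated here.

```javascript
// Shape of a protocol_handlers entry in manifest.json (values are illustrative).
const protocolHandler = {
  protocol: 'web+ipfs',
  name: 'IPFS handler (example)',
  uriTemplate: 'https://gateway.example/redirect?uri=%s' // hypothetical endpoint
};

// Simulates what the browser does: replace %s with the escaped address.
function expandTemplate(template, address) {
  const escaped = encodeURIComponent(address);
  return template.replace('%s', () => escaped);
}

// The location bar ends up showing the expanded https:// URL, not the original
// address, which is why the Origin differs per gateway.
console.log(expandTemplate(protocolHandler.uriTemplate, 'web+ipfs://bafyexample'));
```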

lidel commented

It seems that this new API has the same naming limitation as Chrome, firefox-54.0a2 says:

Reading manifest: Error processing protocol_handlers.0.protocol: Value must either: be one of ["bitcoin", "geo", "im", "irc", "ircs", "magnet", "mailto", "mms", "news", "nntp", "sip", "sms", "smsto", "ssh", "tel", "urn", "webcal", "wtai", "xmpp"], or match the pattern /^(ext|web)+[a-z0-9.+-]+$/

This means we can't support ipfs://hash anymore and are forced to use web+ipfs://hash 👎
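
The rule from the error message can be reproduced in a few lines. This is a sketch of the check as described by the message above, not Firefox's actual validation code, and `isAllowedProtocol` is a hypothetical name.

```javascript
// Safelist and pattern copied from the manifest error message above.
const SAFELISTED_PROTOCOLS = [
  'bitcoin', 'geo', 'im', 'irc', 'ircs', 'magnet', 'mailto', 'mms', 'news',
  'nntp', 'sip', 'sms', 'smsto', 'ssh', 'tel', 'urn', 'webcal', 'wtai', 'xmpp'
];
const PREFIXED_PROTOCOL = /^(ext|web)+[a-z0-9.+-]+$/;

// A name is accepted if it is safelisted or carries the ext/web prefix.
function isAllowedProtocol(name) {
  return SAFELISTED_PROTOCOLS.includes(name) || PREFIXED_PROTOCOL.test(name);
}

console.log(isAllowedProtocol('ipfs'));     // rejected: not safelisted, no prefix
console.log(isAllowedProtocol('web+ipfs')); // accepted: matches the prefixed pattern
```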

When we get official registration of the scheme we should be able to request an exemption to be added.

lidel commented

I've just tested a ridiculous PoC that restores support for ipfs:// dweb: etc in WebExtension:

  • We know that:
    • permission <all_urls> fires onBeforeRequest for every URL
    • when URL starting with an unknown protocol is entered in location bar, a browser url-escapes entire address and converts it to a search query, e.g:
      ipfs://QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR → https://www.google.com/search?q=ipfs%3A%2F%2FQmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR&ie=utf-8&oe=utf-8&client=firefox-b-ab
  • So what happens if you:
    • detect requests with a URL containing url-encoded :/ (%3A%2F)
    • find ones that match a query starting with one of the custom protocols: /=(ipfs|ipns|dweb)%3A%2F(%2F[^&]+)/
    • extract the IPFS path and check if it passes the IsIpfs.path test
      • if so, replace request to search engine with request to resource at public gateway 'https://ipfs.io/' + extractedIpfsPath
      • if not, return original request
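
The steps above can be sketched roughly as follows. This is a simplified illustration, not the code that shipped: the regex is narrowed to `ipfs` and `ipns` for brevity, `looksLikeIpfsPath` is a crude hypothetical stand-in for the `IsIpfs.path` test from the is-ipfs library, and `redirectFromSearchQuery` is a made-up name.

```javascript
// Regex adapted from the comment above (narrowed to ipfs and ipns).
const CUSTOM_PROTOCOL_QUERY = /=(ipfs|ipns)%3A%2F(%2F[^&]+)/;

// Crude stand-in for IsIpfs.path: only checks the overall shape of the path.
function looksLikeIpfsPath(path) {
  return /^\/ip[fn]s\/[A-Za-z0-9._-]+(\/.*)?$/.test(path);
}

// Returns a gateway URL if the request is a search for a custom-protocol
// address, or null if the original request should go through untouched.
function redirectFromSearchQuery(requestUrl, gateway = 'https://ipfs.io') {
  if (!requestUrl.includes('%3A%2F')) return null; // cheap pre-filter
  const match = requestUrl.match(CUSTOM_PROTOCOL_QUERY);
  if (!match) return null;
  let path;
  try {
    path = '/' + match[1] + decodeURIComponent(match[2]);
  } catch (e) {
    return null; // malformed escape sequence
  }
  return looksLikeIpfsPath(path) ? gateway + path : null;
}

// Wiring into the extension (skipped outside a browser). Needs the
// webRequest and webRequestBlocking permissions plus <all_urls>.
if (typeof browser !== 'undefined' && browser.webRequest) {
  browser.webRequest.onBeforeRequest.addListener(
    (details) => {
      const redirectUrl = redirectFromSearchQuery(details.url);
      return redirectUrl ? { redirectUrl } : {};
    },
    { urls: ['<all_urls>'], types: ['main_frame'] },
    ['blocking']
  );
}
```

Keeping the matching logic in a pure function makes it testable without a browser; the listener only translates its result into a `redirectUrl`.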

Well.. it just works 🚀 🙃 (not sure if I am proud or ashamed, probably both)

Expect PR in near future. My plan is to polish and ship this workaround as the default behavior, so that there is no functional regression when users update from v1.6.0 to v2.x.x.

Of course there will be a preference that disables this: to pass review at Mozilla and in case someone really wants to search for ipfs://QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR

Update 2017-12-27: this hack does not work on Android – see #348

lidel commented

Custom Protocols in WebExtension – status update:

Poor Man's Protocol Handlers (#164 (comment)) are enabled by default in v2.0.10.

dag commented

The ipfs scheme has no "authority" as per the RFC so it should not include the double forward slashes. It is debatable whether the ipns scheme includes an authority; it includes a domain name, but it's all routed through IPFS and for example there is no port number and the domain isn't necessarily a peer. Thus even with IPNS the domain is merely an identifier and not an "authority". I therefore propose that it would be more in compliance with the specification to have:

  • ipfs:{hash} and
  • ipns:{domain}

Double slashes are a wart of the URI spec, so it's nice to be able to avoid them. The spec defines authority thus:

authority   = [ userinfo "@" ] host [ ":" port ]

IPFS is an unauthenticated peer-to-peer network, so there is no userinfo and no singular host or port.
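
To make the distinction concrete, here is a minimal sketch that splits a URI using the component regular expression given in RFC 3986, Appendix B. The helper name `splitUri` is hypothetical, and `QmHash` is a placeholder rather than a real hash.

```javascript
// Component regex from RFC 3986, Appendix B.
const URI_PARTS = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Returns the scheme, authority and path; authority is undefined when the
// URI has no "//" part.
function splitUri(uri) {
  const m = URI_PARTS.exec(uri);
  return { scheme: m[2], authority: m[4], path: m[5] };
}

console.log(splitUri('ipfs://QmHash/readme')); // hash lands in the authority
console.log(splitUri('ipfs:QmHash/readme'));   // no authority, hash is part of the path
console.log(splitUri('urn:isbn:0451450523'));  // no authority either
```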

Why I could be wrong

I don't think I'm wrong for the ipfs scheme, but I'm less certain for the ipns scheme. It could be argued that when the RFC defines host to include "registered name", it is very vague as to what that means and it could apply to IPNS. But I think most people would expect that if there are two slashes followed by a domain name, the computer will connect to that domain name, and it could be followed by a port number. Neither of those is necessarily or at all true with IPNS. Double slashes signify the existence of a centralized authority; the antithesis of IPFS.

https://tools.ietf.org/html/rfc3986#section-3.2

URN?

It could also be argued that the proper scheme to use is urn. That's what's used for things like ISBN and UUID, which are a lot like IPFS. urn:isbn:0451450523, urn:ipfs:QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR... Just as ISBN tells you what to look up rather than where, so does an IPFS hash or IPNS name tell you what to look up, not which server to connect to. Personally I always thought URNs were kind of redundant though, why not just isbn:{num}?

lidel commented

@dag Thanks! Mind moving it to ipfs/specs#152? It is a better place for this discussion, more people are watching that repo. This issue is only about implementing current consensus in the browser extension :)

(If you are interested in tl;dr with rationale behind use of URL for ipfs:// and ipns:// and URI for dweb:, see "Four stages of the upgrade path for path addressing")

lidel commented

As a result of a thread at dev-addons Andre opened a Bugzilla ticket to Whitelist custom protocol handlers for work on decentralization technologies with WebExtensions.
If accepted, Firefox users will no longer have to rely on the hack from #164 (comment), but will get the UX from #164 (comment) without the web+ prefix.

A small step, it would be still just a redirect, not a real protocol handler, but worth mentioning here, as it would solve #348 :)

Of course, a redirect-based handler does not solve Problem #2; we need New, Native, Programmable Protocol Handler API for WebExtensions for that.

soapdog commented

@lidel it worked! The patch landed on nightly (I think, I am new to this process). Check out https://hg.mozilla.org/mozilla-central/rev/c2cb8a06bcf1

(PS: I never used IPFS but I thought that since I was going through the process of doing a patch to whitelist scuttlebutt, I should include ipfs and dat as well... glad it worked)

Also I was checking a bit from this thread, and (please correct if I am wrong) apparently you're intercepting a protocol and doing a web redirect. Is that right? If so, then I believe you might be able to use the WebRequest API from WebExtensions to intercept the request and fiddle with the origin handlers you wanted, it might help there.

lidel commented

@soapdog that is really cool, thank you!! 🎉 ❤️
We'll switch to the non-prefixed handler as soon as it lands in a regular Firefox release. 👌

Yes, we are intercepting and redirecting. The problem is that we want to control Origin in the browser itself, to isolate DApps from each other. AFAIK modifying Origin header in requests will only spoof Origin for remote server, while things like JS on a page will still use standard Origin for cookies and storage. Proper security context for IPFS won't work without webextension support for synthetic origins/protocols.

Update 2018-01-15: Bugzilla ticket is marked as resolved/fixed in Firefox 59.

I was one of the authors on some drafts of some of those specs (URN and URI) so I can give you some of the thinking that went behind decisions.

Double slashes were in there to distinguish them from a single leading slash, i.e. /foo means go to the same protocol, same host, and ask for /foo, while //xx.yy.zz meant go to the xx.yy.zz host first and ask for /foo. That was a feature of all known protocols at that time and, as pointed out, doesn't mean anything in a host-less world.

urn: was in there for three reasons (instead of isbn)
a) because the part before the : implied a protocol, what you use to talk to the resolver; in practice it told you what library to hand the query off to.
b) there is no isbn protocol, so it shouldn't really be before the first :, and at the time there was a good likelihood of there being a protocol for resolving URNs.
c) because it's a hard place for extensibility. Netscape added urn: around 1996 on my request to allow us to experiment, but there was no way they were going to add one protocol for each of isbn, and each of the other naming authorities, especially without any library/protocol to hand it off to.

lidel commented

FYSA some recent developments (Q3 2018):

  • Experimental Protocol Handler API for WebExtensions is being designed as a part of mozilla/libdweb:

    The Protocol API allows you to handle custom protocols from your Firefox extension. This is different from the existing WebExtensions protocol handler API in that it does not register a website for handling corresponding URLs but rather allows your WebExtension to implement the handler.
    More: https://github.com/mozilla/libdweb/#protocol-api

    • Still experimental, requires a special build, but the goal is for this to land in regular builds of Firefox Nightly at some point in the future ✨

    • We are working on a PoC handler that loads data over raw IPFS and keeps ipfs:// in address bar. It can be found in libdweb branch. More info in PR #533 and /libdweb/docs/libdweb.md

  • Safelisting DWeb Protocols | arewedistributedyet/arewedistributedyet#23

    • tl;dr The goal is to be able to register it in all browsers without web+ prefix.
      (Chrome 67 requires web+ prefix for non-safelisted protocols)
    • Firefox already allows non-prefixed version, Chrome published intent-to-implement.

lidel commented

If DWeb protocols get safelisted, we need manifest.json/protocol_handlers feature parity to solve UX issues mentioned in arewedistributedyet/arewedistributedyet#23 (comment)

Filed a Chrome bug: Extensions API should implement manifest.json/protocol_handlers.

ingokeck commented

According to the URI syntax in RFC3986, it should not be "ipfs://{cidv1b32}", instead it should be "ipfs:{cidv1b32}". Same for {hash}. The extension should then use its default IPFS instance to retrieve the object {cidv1b32}. See https://tools.ietf.org/html/rfc3986#section-3

'//' denotes a specific instance. So an IPFS service on localhost could be addressed with "ipfs://localhost/{cidv1b32}", while IPFS running on ipfs.io would be "ipfs://ipfs.io/{cidv1b32}"

lidel commented

Those are good points, thanks for raising them @ingokeck
We would love to be as compliant with RFCs as possible, as long as we don't compromise Origin isolation (details below), so this needs deeper analysis.

  • 💚 For the sake of UX, I believe both should work. For example, you can use ipfs:bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi in Firefox today and if you have IPFS Companion installed, it will resolve just fine. If we flip the default, the reverse should be true as well.

  • 💛 Removal of // needs discussion with Brave, Opera, and other community members, but I suspect the only reason for // is technical debt in web browsers and being pragmatic. Browsers are HTTP-centric and existing implementations leverage this property of RFC3986:

    Non-validating parsers (those that merely separate a URI reference into
    its major components) will often ignore the subcomponent structure of
    authority, treating it as an opaque string from the double-slash to
    the first terminating delimiter, until such time as the URI is dereferenced.

    For example, Brave v1.19 uses the CID as the "authority" component and builds the Origin based on it. This provides us with a security sandbox per content root (CID), but has a side-effect of // in the address bar.

    TLDR: Origin isolation is way more important than being visually compliant with the RFC. // can be removed only if Origin isolation per CID is maintained, which I don't believe is possible atm.

  • 💔 ipfs://localhost/{cidv1b32} is incompatible with our security model given how addressing is implemented in browsers

    • Each CID needs to create its own Origin, and this will put all CIDs under a single Origin, so a big NO.
    • Yes, this could be special-cased, but it is very unlikely any vendor will want to touch critical code paths for the sake of cosmetics. In the past we looked into something simpler with Suborigins and even that did not go anywhere: ipfs/in-web-browsers#66
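
A toy model of the argument above, assuming an Origin is derived from the scheme plus the authority component (real browsers use scheme, host and port, so this is a simplification). The helper `originKey` and the placeholder names `cida` and `cidb` are hypothetical.

```javascript
// Simplified model: Origin is keyed by scheme plus authority.
function originKey(uri) {
  const m = /^([a-z][a-z0-9+.-]*):\/\/([^\/?#]*)/i.exec(uri);
  return m ? `${m[1].toLowerCase()}://${m[2]}` : null; // null: no authority at all
}

// CID as authority: every content root gets its own sandbox.
console.log(originKey('ipfs://cida/app.js'), originKey('ipfs://cidb/app.js'));

// Gateway host as authority: both content roots share a single Origin.
console.log(originKey('ipfs://localhost/cida/app.js'), originKey('ipfs://localhost/cidb/app.js'));

// No double slash: nothing to build an Origin from in this model.
console.log(originKey('ipfs:cida/app.js'));
```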

lidel commented

5 years after Firefox, Google Chrome is now willing to implement the protocol_handler for registering custom handlers via web extensions: w3c/webextensions#317 (comment)

Igalia is working on a Chromium improvement to allow redirect-based handler registration with protocol_handlers in the browser extension manifest. They completed the design document (The protocol_handlers Web Extension's Manifest key), including permissions management and conflict resolution, and shared it with Google for feedback.

Assuming that eventually lands in Chromium-based browsers, the next step will be to allow using the API for pointing a protocol scheme at a Service Worker, instead of a URL. This will allow keeping ipfs://cid in the address bar and using it for Origin-based logic natively.

This will take even more time, for now the idea is tracked in: