ipfs/js-ipfs

A viable alternative to go-ipfs?

Closed this issue · 25 comments

I would very much like for the js-ipfs implementation to be the most widely used IPFS implementation and I'm interested to know what the main blockers are for that.

We realise you have a choice of IPFS provider and your choice is important to us.

Or in other words, why do you type ipfs daemon and not jsipfs daemon?

I've had some interesting insight already:

@olizilla:
it's running on a different api port, so i have to config things to use it
i've got a repo in ~/.ipfs that has some things in i want to share
and the command is jsipfs rather than ipfs
these are all things to make it easier to dev both, but they all make me feel like the go one is the one i should be running day-to-day
also there is like 10 people on the js dht and 800+ on the go one

@lidel
this may be a niche reason, but there is an official docker image with go version as well
it may be important for devs, because historically, i used different docker images to quickly find regressions in IPFS APIs used by companion; having one for jsipfs would be nice

lidel commented

After giving it some additional thought, js-ipfs is missing:

Would a post install step to symlink ipfs -> jsipfs if ipfs isn't already taken be useful or just complicate things?

What about an ipfsd-ctl-like CLI? We would have only one jpfs bin with the option to choose between go and js, and we could even have it work like nvm and provide the version to run. Makes sense?

lidel commented

I agree with @hugomrdias , CLI UX needs to be solved before we start playing with default symlink.

Perhaps it could be something like this:

USAGE
  ipfs provider ls      - print numbered list of installed providers (eg. go-ipfs, js-ipfs)
  ipfs provider set <x> - switch default provider (accepts a number from ls output or a path)
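
To make the proposal concrete, here is a minimal sketch of the lookup logic behind `ls` and `set <x>`. Nothing like this exists in either implementation; the `Provider` type, both function names, and the install paths are hypothetical.

```typescript
// Hypothetical model of an installed IPFS implementation.
interface Provider {
  name: string; // e.g. "go-ipfs" or "js-ipfs"
  path: string; // path to the binary
}

// `provider ls`: print a numbered list, starting at 1.
function listProviders(providers: Provider[]): string[] {
  return providers.map((p, i) => `${i + 1}. ${p.name} (${p.path})`);
}

// `provider set <x>`: accept a number from the ls output, or a path.
function resolveProvider(providers: Provider[], arg: string): Provider {
  if (/^\d+$/.test(arg)) {
    const chosen = providers[Number(arg) - 1];
    if (!chosen) throw new Error(`no provider numbered ${arg}`);
    return chosen;
  }
  // Not a number: treat it as a path, reusing a known entry when it matches.
  const known = providers.find((p) => p.path === arg);
  return known ?? { name: "custom", path: arg };
}

// Example data; these paths are placeholders.
const installed: Provider[] = [
  { name: "go-ipfs", path: "/usr/local/bin/ipfs" },
  { name: "js-ipfs", path: "/usr/local/bin/jsipfs" },
];
const selected = resolveProvider(installed, "2"); // picks the js-ipfs entry
```

Persisting the selection (a symlink, a config file) is left out on purpose, since that is exactly the part the thread has not settled.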

I've found another reason why people run go-ipfs instead of js one: debugging and metrics.
go-ipfs exposes various metrics and debug info on API port:

More reasons people pick go-ipfs:

  • Raw performance - the go-ipfs gateway can serve GBs of data per second, given appropriate hardware.
  • Dependencies - a statically compiled binary simply rocks. On the user end of go-ipfs there's just zero dependency issues. Just download the binary and get going.

My answer to this question in one image:

image

I'm really pumped by this goal! 🚀

vmx commented

In the meantime if you want your JS node to be more like a Go node:

#!/bin/sh

jsipfs config 'Addresses.API' '/ip4/127.0.0.1/tcp/5001'
jsipfs config 'Addresses.Gateway' '/ip4/127.0.0.1/tcp/8080'
jsipfs config 'Addresses.Swarm[0]' '/ip4/0.0.0.0/tcp/4001'

I believe it is time to revisit this issue, not with the goal of creating a viable alternative for the sake of doing it, but focusing on what would it mean to move out of α into js-ipfs β. We should answer questions such as:

  • What degree of docs do we want?
  • Which APIs should we bless?
  • What features need to be in?
  • What features should go out?
  • s/jsipfs/ipfs ?
  • etc

What do folks think?

Just my two cents, but why try to mimic go-ipfs and make people want to type jsipfs daemon instead of ipfs daemon? Go is so much better suited for desktop/server usage than JavaScript, and nothing you can do will change the inherent tradeoffs from language choice (speed, the compiled binary thing, etc). If someone wants to run the ipfs daemon on their computer, shouldn't it be go-ipfs?

What really sets js-ipfs apart in my opinion is the ability to run it in the browser. It requires zero installation and config from the least tech-savvy person in the world. Go can't touch that, and thus go-ipfs doesn't have a chance of running on every computer in the world until it's bundled with OSes/browsers---which won't happen until browsers are running it anyways in their webpages/service workers as js-ipfs.

I think js-ipfs should double down on its strengths---getting into the hands of normal people and driving mainstream IPFS adoption via the browser---and let go-ipfs have its natural habitat on servers and desktop installs.

lidel commented

I agree with @npfoss: focusing on drop-in replacement for go-ipfs makes js-ipfs play a losing catch-up game and miss the opportunities provided by the web platform.

We had some informal discussions about reframing js-ipfs as a web-first product (in the past it has been Node-first, browser-second), and the community seems to be really supportive of that direction.

FYSA there is already a huge body of work related to js-ipfs in browsers:

I really think we should double down on this.

I think our aim is to have IPFS and supporting technologies be a standard, as widely used as HTTP - to do that complete implementations in many languages need to exist. Ideally we wouldn't have to do this all ourselves (which is starting to happen, at least to the libp2p components) but there will be a certain amount of bootstrapping that needs to happen.

Developer adoption will be key to this - JS devs will want a JS implementation and Go devs will want a Go implementation. They can debug "their" implementation more effectively, it fits with their existing tooling and development processes and it's more likely to respect the conventions of their platform.

As long as they can discover each other & exchange data, I don't see why each can't leverage their own strengths.

The 'catch-up' game is part of this, but if we can agree what 'complete' means, perhaps we can limit its scope to make it more achievable, then get on to all the cool stuff in @lidel's list. I think a lot of this is a question of prioritisation and resourcing.

We had some informal discussions about reframing js-ipfs as a web-first product (in the past it has been Node-first, browser-second), and the community seems to be really supportive of that direction.

💯

At some point, we’re going to need to tackle the bundle size issues. There’s been work to incrementally bring this down over time, but more drastic changes to how js-ipfs and its dependencies are structured would be necessary to bring it down below 1mb. At some point, this actually conflicts with the goal of “feature parity,” or at least OOTB feature parity, with go-ipfs.

I’ve been exploring what IPLD looks like when pared down to run in a much more restricted bundle size. This has meant writing dependencies that don’t include Node.js polyfills like Buffer, not including codecs and having the user configure them, and not supporting hash functions other than SHA.

The way js-ipfs is structured is a bit of a “kitchen sink” approach. It’s a large project that comes with a very large set of functionality. That means a big bundle, and bringing that bundle down means shipping with less OOTB features and more user configuration of the exact features each user wants to include.

The 'catch-up' game is part of this, but if we can agree what 'complete' means, perhaps we can limit its scope to make it more achievable, then get on to all the cool stuff in @lidel's list.

I resonate with this. In fact, we've done an exercise to identify what IPFS Core is multiple times (especially for go-ipfs, as the scope increased a lot there too) and the conclusion is always the same: an IPFS implementation needs to be able to provide and retrieve files from the IPFS network, and from that you unpack that you need Bitswap, Magic Connectivity and Files Data Structures (i.e. IPLD).

What it does not require are things such as:

  • URLStore & FileStore
  • Fuse Mount
  • Any kind of notion for mutable content, including IPNS (there has been a discussion of moving IPNS into its own separate binary. However it is a huge bikeshed because users expect to be able to resolve mutable pointers directly from IPFS. IPNS could be a plugin).

Now, there are some things that should be priority such as: Anything that helps achieve the main goal of retrieving and providing files vs. GC and other utilities. This takes us to @lidel's list #1563 (comment)

There’s been work to incrementally bring this down over time, but more drastic changes to how js-ipfs and its dependencies are structured would be necessary to bring it down below 1mb

@hugomrdias can you list here the issues that track your work on reducing the bundle size? I know https://bundlephobia.com/result?p=ipfs@0.37.0 is linked from the README, but I recall an issue like an awesome endeavour that you were pursuing.

The way js-ipfs is structured is a bit of a “kitchen sink” approach. It’s a large project that comes with a very large set of functionality. That means a big bundle..

Thanks to @hugomrdias' work we actually can have a peek at what takes space through https://bundlephobia.com/result?p=ipfs@0.37.0

image

As you can see, the largest chunk is the crypto code (node-forge 18.5%, tweetnacl 3.9%, libp2p-crypto 2.2%), which takes 24.6% of the bundle. That is 344 kB that is hard to reduce, as nodes need the crypto to communicate with other nodes.
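
As a quick sanity check on those figures, the arithmetic can be written out as below. The percentages and the 344 kB come from the comment above; the implied total is only a back-of-the-envelope estimate derived from them, not a number read off bundlephobia.

```typescript
// Shares of the ipfs@0.37.0 bundle quoted above (percent of total).
const cryptoShares: Array<[string, number]> = [
  ["node-forge", 18.5],
  ["tweetnacl", 3.9],
  ["libp2p-crypto", 2.2],
];
const cryptoKB = 344; // size attributed to the crypto dependencies

// Sum of the three shares: about 24.6 percent.
const cryptoPercent = cryptoShares.reduce((sum, [, pct]) => sum + pct, 0);

// If 344 kB is ~24.6% of the bundle, the whole bundle is roughly 1398 kB.
const impliedTotalKB = cryptoKB / (cryptoPercent / 100);

// Everything that is not crypto: roughly 1054 kB.
const nonCryptoKB = impliedTotalKB - cryptoKB;
```

So even dropping every crypto dependency would leave the bundle at about 1 MB by this estimate.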

We also have things that are in progress to disappear with #1670, namely readable-stream (4.8%) and async (2.8%).

Btw, I was surprised to see that multicodec and multihashes are so big, am I alone?

One other note: scanning through https://github.com/ipfs/js-ipfs/issues, I can't find any user reporting that the current gzipped bundle size is a deal breaker for them.

One more remark on the topic of making JS IPFS more browser-first. In addition to developing solutions to the needs described in the issues listed by @lidel at #1563 (comment), we also need to test effectively to ensure that a JS IPFS node works fully in the browser, so that situations like #2093 stop being a surprise and become something we know about. More concretely, what I'm thinking is that tests should:

  • Spawn the multiple evergreen browsers we want to support
  • Spawn multiple browser nodes and have them interact with each other (currently there are no tests that do this)
  • Create integration tests that assert that js-ipfs is fully capable of working with the Main Network.

This work doesn't fall within the categories of feature parity or bundle size reduction; it is about being fully confident in the claims we want to make. Historically (when we started), spawning multiple browsers and creating a tiny network was painful. Today we have things such as https://github.com/GoogleChrome/puppeteer, which should make things much more pleasant to build :)

As you can see, the largest chunk is the crypto code (node-forge 18.5%, tweetnacl 3.9%, libp2p-crypto 2.2%), which takes 24.6% of the bundle. That is 344 kB that is hard to reduce, as nodes need the crypto to communicate with other nodes.

Is it the case that we need crypto that isn’t part of WebCrypto to communicate with any node, or is it just that we have optionality in libp2p that needs additional crypto to communicate with some nodes and not others?

I ran into this w/ multihash already. Bundling all the possible hash functions is expensive, but if you pare down to just SHA hashes you can get an incredibly small multihash implementation.
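
To illustrate how small that can get, here is a sketch of multihash framing that handles sha2-256 only. It is not the pared-down implementation mentioned above; the function names are made up. The two constants are the sha2-256 code and digest length from the multihash table, and both fit in a single varint byte. The digest itself is passed in, so it could come from WebCrypto or anything else.

```typescript
const SHA2_256_CODE = 0x12; // multihash code for sha2-256
const SHA2_256_LENGTH = 32; // digest length in bytes

// Wrap a raw SHA-256 digest as <code><length><digest>.
function encodeSha256Multihash(digest: Uint8Array): Uint8Array {
  if (digest.length !== SHA2_256_LENGTH) {
    throw new Error(`expected a ${SHA2_256_LENGTH}-byte digest`);
  }
  const out = new Uint8Array(2 + SHA2_256_LENGTH);
  out[0] = SHA2_256_CODE;
  out[1] = SHA2_256_LENGTH;
  out.set(digest, 2);
  return out;
}

// Unwrap a multihash, rejecting every hash function except sha2-256.
function decodeSha256Multihash(bytes: Uint8Array): Uint8Array {
  const wellFormed =
    bytes.length === 2 + SHA2_256_LENGTH &&
    bytes[0] === SHA2_256_CODE &&
    bytes[1] === SHA2_256_LENGTH;
  if (!wellFormed) {
    throw new Error("unsupported multihash: only sha2-256 is handled");
  }
  return bytes.subarray(2);
}
```

The tradeoff is the one described in this thread: anything hashed with another function fails loudly instead of working out of the box.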

node-forge ships with damn near every transport you can think of (although I’m sure some of these don’t even survive the bundling process). It’s just not possible to produce small bundles when you include support for so many things by default. This is exactly what I was talking about when I said there’s a “kitchen sink” approach.

I think this may be what people mean when they say there has been a “node first” approach as opposed to a “browser first” approach. There’s not that much penalty for including a lot of optional behavior people only use occasionally in Node.js. In fact, it’s a much better developer experience to have these things “just work” without having to import or configure anything. But in the browser it’s quite expensive to include transports and hash functions the user may not even use, and it may be more acceptable to expect developers to import and configure these when they need them.

Browser testing

It would make sense to me not to invest in more complicated browser testing infrastructure until we first get coverage reports working for our browser tests. It’s not that I don’t think these more complicated tests would be useful, it’s that I don’t see how we would be able to measure and understand how useful they are without coverage. Puppeteer has coverage support and there’s a module for getting istanbul compatible coverage reports, so we can even overlay/combine the coverage between nodejs and the browser.
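
As a rough illustration of what a first coverage number could look like, the sketch below computes the fraction of shipped script text that actually executed. The `CoverageEntry` shape is written to mirror what I understand Puppeteer's `page.coverage.stopJSCoverage()` to return (`url`, `text`, and non-overlapping `ranges`); treat that shape and the function name as assumptions. This is a byte-level ratio, not an istanbul report.

```typescript
// Assumed shape of one coverage entry (one per script loaded by the page).
interface CoverageEntry {
  url: string;
  text: string; // full source text of the script
  ranges: Array<{ start: number; end: number }>; // executed character ranges
}

// Fraction of all script characters that were executed, from 0 to 1.
function usedRatio(entries: CoverageEntry[]): number {
  let totalChars = 0;
  let usedChars = 0;
  for (const entry of entries) {
    totalChars += entry.text.length;
    for (const range of entry.ranges) {
      usedChars += range.end - range.start;
    }
  }
  return totalChars === 0 ? 0 : usedChars / totalChars;
}

// Made-up sample: 6 of the 10 characters ran.
const sampleCoverage: CoverageEntry[] = [
  {
    url: "bundle.js",
    text: "0123456789",
    ranges: [
      { start: 0, end: 4 },
      { start: 6, end: 8 },
    ],
  },
];
const sampleRatio = usedRatio(sampleCoverage);
```

Line and branch coverage that can be merged with the Node.js reports would still need the istanbul conversion mentioned above.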

We might be able to make some of the more esoteric bits opt-in, we've talked about making IPFS modular a la libp2p before, but like David says, the issues people open aren't that the bundle is too big, it's that content/peers are hard/slow to discover and that they have problems understanding the API/functionality IPFS presents.

We should obviously strive to keep the bundle size down, but from the sort of issues people open, it's not the most pressing issue people are running into.

One additional thought.

If we have a large dependency chain that we only need once we reach specific network conditions, we could break that off into something we load dynamically when we reach that state. I don’t mean webpack code splitting here, I mean just loading a module we’ve published to CDN dynamically which wouldn’t cause code splitting in anyone’s bundle. Since this only gets triggered under a particular network operation, we’re already doing something async and can async load an additional set of modules. This could bring down the bundle size and only load certain behavior when necessary.

This would cause a mild performance penalty when the condition is met, but it’s probably worth it if it can dramatically increase the startup performance.
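
A minimal sketch of that idea, assuming native dynamic `import()` of a published ES module: the `lazy` helper and the CDN URL are hypothetical, and a bundler may need to be told to leave the URL import alone.

```typescript
// Wrap an async loader so the module is fetched at most once, on first use.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load().catch((err) => {
        // Forget the failed attempt so a later call can retry.
        pending = undefined;
        throw err;
      });
    }
    return pending;
  };
}

// Hypothetical usage; the URL is a placeholder, not a published module:
//
//   const loadOptionalCrypto = lazy(
//     () => import("https://cdn.example/optional-crypto.js")
//   );
//
//   // Later, inside the network code path that actually needs it:
//   const mod = await loadOptionalCrypto();
```

Callers pay the network round trip only the first time the condition is hit; concurrent callers share the same in-flight promise.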

the issues people open aren't that the bundle is too big, it's that content/peers are hard/slow to discover and that they have problems understanding the API/functionality IPFS presents.

Fair enough, but we should keep in mind that we won’t hear from users who simply can’t use the software at all because of how they need to distribute and run the software they would like to build on it. The current bundle is too big for most mobile applications, too big for Cloudflare Workers, and due to other constraints mostly doesn’t work in Lambda and other serverless environments. That’s a fairly large amount of distribution we’re currently left out of and won’t hear from until something changes on our end or theirs.

Prior art #804

Just came here to double down on @lidel's comment #1563 (comment) ...to use js-ipfs in place of go-ipfs we'd need improvements in observability, and specifically the Prometheus endpoint.

But! It looks like this thread has unearthed an emerging consensus that we want to define a subset of the IPFS api as essential, and make it work really well in the browser & node.

@daviddias's comment

Which APIs should we bless?
What features need to be in?
What features should go out?

@npfoss comment

Just my two cents, but why try to mimic go-ip

@lidel comment

I agree with @npfoss: focusing on drop-in replacement for go-ipfs makes js-ipfs play a losing catch-up game and miss the opportunities provided by the web platform.

@achingbrain

As long as they can discover each other & exchange data, I don't see why each can't leverage their own strengths. The 'catch-up' game is part of this, but if we can agree what 'complete' means, perhaps we can limit its scope to make it more achievable, then get on to all the cool stuff in @lidel's list. I think a lot of this is a question of prioritisation and resourcing.

@mikeal

bringing that bundle down means shipping with less OOTB features and more user configuration.

Are we ready to define what features js-ipfs won't focus on? Which parts are optional, and could be extracted to separate modules? Who's going to lead this charge?

js-ipfs is being deprecated in favor of Helia. You can follow #4336 and read the migration guide.

Please feel free to reopen with any comments by 2023-06-02. We will do a final pass on reopened issues afterwards (see #4336).

This is super relevant to our current js-ipfs deprecation and something that @BigLep and @achingbrain should peek at before we close for good (i.e. i'm leaving open for now). Some things I think are important for you two to consider:

  1. Are there directions/requests in this we disagree with? Do we have an answer for that in Helia docs or do we need to make a blog post?
  2. Code splitting and piece-meal approach seems highly desired by a number of folks in this thread, I believe we're heading in the right direction with Helia. It's pretty isomorphic in my experience, but I haven't done any in-depth analysis.
  3. Browser first vs Node first - I think this is the wrong question. With ESM and the latest changes in JS (in the 5 years since this issue was opened) this question is no longer really relevant, and instead, I think we need to focus on ECMAScript support, regardless of runtime. We should expose browser specific, node specific, or other runtime-specific functionality as extensible/pluggable libs. I believe Alex is taking us in the right direction in being ECMAScript (or TypeScript/JavaScript) first.

There are already decent benchmarks showing that Helia can be a replacement for Kubo (go-ipfs's new name for those not aware) in some cases, and will probably be many more shortly.

For those interested, please help us make Helia even better!

You guys would know better than me, but it seems like the networking headaches alone make a Browser vs Node focus somewhat at odds -- it's not just about ECMAScript. Plus having a more intense focus on the core value add of Helia (living in the browser, where Kubo can't reach) just helps across the board on prioritizing what matters most to bring IPFS forward.

I recently came back to building on IPFS and have been disappointed that the best option is still to just use a gateway...

Just my 2c though, I'll be cheering for you regardless

I recently came back to building on IPFS and have been disappointed that the best option is still to just use a gateway...

@npfoss gateways will be the fastest for a while; at least until we get more webtransport (in kubo/iroh/other impls, as well as more browser support) and Helia nodes in the wild, but Helia (and latest libp2p updates) have improved the browser support of IPFS drastically.

If you are using a gateway, the trustless gateway fetching and block validation (one example of doing this in a recent PR I opened, https://github.com/ipfs/ipld-explorer-components/blob/1cf10d17d091eb52054b721fa54e4f7eba0e480d/src/lib/get-raw-block.ts#L105) is a great path forward.
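
For anyone curious what that looks like in outline, here is a sketch (not the code from the linked PR). It assumes the trustless gateway convention of requesting a single block with `?format=raw` and `Accept: application/vnd.ipld.raw`, and it assumes the expected SHA-256 digest has already been extracted from the CID by a CID library. All names are hypothetical; `fetchFn` and `sha256` are injected so the sketch stays runtime-agnostic.

```typescript
type Sha256 = (data: Uint8Array) => Promise<Uint8Array>;
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; arrayBuffer(): Promise<ArrayBuffer> }>;

// Build the request for one raw block from a trustless gateway.
function rawBlockRequest(gateway: string, cid: string) {
  const base = gateway.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/ipfs/${cid}?format=raw`,
    headers: { Accept: "application/vnd.ipld.raw" },
  };
}

function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Fetch a block and refuse to return it unless its hash matches the CID.
async function fetchVerifiedBlock(
  gateway: string,
  cid: string,
  expectedDigest: Uint8Array, // assumed: SHA-256 digest taken from the CID
  fetchFn: FetchLike,
  sha256: Sha256
): Promise<Uint8Array> {
  const { url, headers } = rawBlockRequest(gateway, cid);
  const res = await fetchFn(url, { headers });
  if (!res.ok) throw new Error(`gateway returned ${res.status}`);
  const bytes = new Uint8Array(await res.arrayBuffer());
  if (!bytesEqual(await sha256(bytes), expectedDigest)) {
    throw new Error("block bytes do not match the CID");
  }
  return bytes;
}
```

In a browser, `fetchFn` could be the global `fetch` and `sha256` a thin wrapper around `crypto.subtle.digest("SHA-256", …)`. The point is that the gateway only has to be reachable, not trusted.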

Helia (and latest libp2p updates) have improved the browser support of IPFS drastically

glad to hear it :) sounds like things are really moving again

trustless gateway fetching and block validation

ooh cool, maybe I should switch over for the validation, even if it is still gateway backed

Per #1563 (comment) and the js-ipfs deprecation, I'm going to close this issue. This issue has been marked as a notable historical issue in #4336. Feel free to reopen by 2023-06-08 if there are critical open items on this topic that anyone wants to discuss.