ipfs/js-ipfs

Improving init (aka, how do I get a js-ipfs node instance!)

daviddias opened this issue ยท 27 comments

There has been a lot of feedback when it comes to the API for creating an IPFS node instance, some feedback from myself included. It is true, the process is a bit cumbersome, instead of having something beautiful as

// # Option 1
const options = {
  init: true
  online: true
}
const node = new IPFS(options)

We have:

const repo = <IPFS Repo instance or repo path>
const ipfs = new IPFS(repo)
ipfs.init({ emptyRepo: true, bits: 512 }, (err) => {
   if (err) { throw err }
   ipfs.load((err) => {
     if (err) { throw err }
     ipfs.goOnline((err) => {
       if (err) { throw err }
       // finally something you can work with!
   })
})

Yes, it is not nice, but why is it this way right now? Why can't it be sync, well for some reasons:

  • init needs to get the repo started if it doesn't exist, since IPFS Repo use pull-blob-store adapters and these are async (because a Repo can be a network storage), this call needs to be async
  • load is when we take a repo and load all the config values. This also can't be sync because we only store the Private Key on the config and the generation of the PublicKey needs to be async (Thanks WebCrypto! see more at: #485))
  • goOnline this is where we turn on bitswap, it means that we are ready to fetch and serve blocks from and to other nodes.

This can be way more nicer, however, starting from documentation of this methods, examples and explaining what the functions are doing inside the readme.

Another thing to consider is making the IPFS instance an event emitter with a 'ready' event, so that Option 1 can be achieved by loading everything as options and emitting a ready event.

Other things to consider

  • It would be sweet if the options passed to init or to new IPFS(options) would also accept values to overload the config with. Right now we always create the repo and then use the config API, it works but this option would reduce the steps, making code easier to read.
  • goOnline should be bitswap.on and bitswap.off, as we will need soon to be able to enable and disable things like the dht.

I would love to hear everyone's thoughts on this, I might be missing something that might get us to a even nicer start process, looking forward for feedback :) Thank you!

I have seen apis where it is done in this way:

const ipfs = new IPFS(options)

ipfs.on('ready', () => {
  // We are ready to do things
})

ipfs.on('error', (err) => {
  console.error('failed to start', err)
})

These event listeners could also be attached through options for convenience similar to nodes http server

const ipfs = new IPFS(options, () => {
  console.log('WE ARE READY')
})

exactly @dignifiedquire, that goes along the lines of what I was thinking with Option 1 + ready event, it feels like something familiar (like a TCP listener or a HTTP server in Node.js).

๐Ÿ‘ for the events pattern. This would allow us to have more granular events to the state if needed, like connected etc. Another plus side is that if we do this, we can have the same API for js-ipfsd-ctl (which I wrapped in ipfs-daemon module) and allows to converge the native vs. browser "give me daemon" libs, and use the "browser" field in package.json to decide separate the two.

I really like this direction!

Oh nice @haadcode! I was not even thinking that we could do that too. I'm getting even more excited :D

This can be way more nicer, however, starting from documentation of this methods, examples and explaining what the functions are doing inside the readme.

For the README, the rule of don't use too many inline comments because it messes up how easy it is to work on the code doesn't apply. It might be a good idea to have some of the comments you have above inlined in the code example; that way, people can look at the internal functions and understand what each step is doing.

@diasdavid @dignifiedquire I would like to work on this in the next couple of weeks (unless someone else has the time earlier) and come up with a module that does what has been discussed above?

I'm thinking we can wrap this into a module that can return either js-ipfs daemon or js-ipfs-api daemon. I already have ipfs-daemon module that does this for js-ipfs-api, would you be ok using that as the name?

@haadcode I'm not sure we want this in another module, I'd rather have ipfsd-ctl and js-ipfs natively supporting this same api, with a shared test suite maybe. And then you install the one you need.

@haadcode you are welcome to take the lead, but it is better to converge efforts, currently we have:

Which, IMHO, should just be one module that offers the same API that js-ipfs has to create an instance. Also, IMHO I would converge them all to js-ipfd-ctl, instead of creating yet another one.

I'm not super sure if we have to have another module (your proposal of ipfs-daemon) that creates either a js-ipfs or js-ipfsd-ctl (go-ipfs) daemon. I would prefer (x100) that we grab the IPFS Factory that has been created for js-ipfs (https://github.com/ipfs/js-ipfs/tree/master/test/utils/factory-core) and js-ipfs-api (https://github.com/ipfs/js-ipfs-api/tree/master/test/factory) and expose that as a module that can generate instances of both, this will be the ideal module for testing.

IPFS Factory does exactly the same for js-ipfs as ipfs-daemon does for js-ipfs-api. I would much rather have one module that works for both (Node.js and Browser), as opposed to having to choose which module to install (though, I'm not 100% sure if this works the way I imagine it to work).

I'll play around with this and see how it could work nicely.

IPFS Factory also exists for js-ipfs-api and it has a feature that a browser can spawn a go-ipfs daemon too by calling to the IPFS Factory API, without having to change a line in the code :)

(that's a good feature to have)

Let's wrap that into a module then and go from there?

Does IPFS Factory support non-disposable daemons now? Last time I checked it, it didn't.

@haadcode tagging you, as you volunteer, to lead this effort and the revamp of ipfsd-ctl.

Main goal: have a canonical way to spawn js-ipfs in process daemons, js-ipfs remote daemons (using js-ipfs-api to contact them), go-ipfs remote daemons (using js-ipfs-api to contact them) and provide with features to spawn several of ephemeral nodes, which are great for testing.

Relevant threads:

Yeah, I can take this around beginning of December. If anyone in the meanwhile would like to work on this, please do!

Some thoughts here:

  • This is a lot of work.
  • We're basically refactoring everything in order to support all the feature requests in the issues linked above by @diasdavid.
  • I'm not a huge fan of the name ipfsd-ctl. I would prefer ipfs-daemon, much easier to understand what it does based on what it says on the tin.
  • I'm uncertain how exactly all of this would come together, as js-ipfs and ipfsd-ctl + js-ipfs-api work very differently atm
  • I'm cautious about js-ipfs remote daemons (using js-ipfs-api to contact them) @diasdavid, isn't this a big security risk? I understand the need for this, and agree it's great for testing, but I'm concerned we open a big security hole here.
  • Did I mention this is a lot of work?
  • ...and because of that, perhaps it would make more sense to break the workload into more manageable chunks and add a new feature one at a time?

That all said, I'm very excited to get all this done! :) What we have now has worked ok-ish so far but given our progress, growth, etc. all these features and fixes make a lot of sense.

I'm cautious about js-ipfs remote daemons (using js-ipfs-api to contact them) @diasdavid, isn't this a big security risk? I understand the need for this, and agree it's great for testing, but I'm concerned we open a big security hole here.

How is this a security risk bigger or smaller than js-ipfs-api + go-ipfs?

Did I mention this is a lot of work?

It is, agree on having manageable chunks.

@diasdavid Ah! I understand now what you mean. This is obviously when running js-ipfs in Node.js :) Ignore that comment, then!

I'm not a huge fan of the name ipfsd-ctl. I would prefer ipfs-daemon, much easier to understand what it does based on what it says on the tin.

I would even take it a step forward and call it ipfs-node-factory, simply because a daemon is a detached process running in the background (which is not true while running js-ipfs in the browser or in proc) and this module should enable to spawn 1 to n daemons ad hoc, ephemeral or not.

The API proposed here is being prototyped in this repo: https://github.com/haadcode/ipfs-daemon. The idea is not to create a new module but to figure out the optimal API design and move the code from ipfs-daemon to their relevant existing ipfs modules.

@dignifiedquire proposed in yesterday's call that we should have an explicit start() method. I would agree with that.

The proposed usage would be:

const ipfs = new IPFS()
ipfs.on('ready', () => ...)
ipfs.start()

Pros

  • Makes it possible to add steps before starting the daemon
  • Technically makes the code behave correctly (if we do new Ipfs() that emits the ready event, then the ready event could be emitted before it's been subscribed to)

Cons

  • Adds an extra command to call to start the daemon/instance making it slightly less automagic

The start command is effectively the current goOnline, as you can see in the initial proposal -- #556 (comment) --, this step can be also a configure option {online/start: true}.

I agree on keeping a general 'boot this thing up, damn it! :D' function, nevertheless, we should include mechanisms to turn network services on an off, as described on the initial proposal too, with the following comment:

goOnline should be bitswap.on and bitswap.off, as we will need soon to be able to enable and disable things like the dht.

This will be super useful for a lot of cases (and testing!)

is "how do I get a js-ipfs node instance" already possible with current js-ipfs ?

@thisconnect the setup described at the very top works with the current version.

@dignifiedquire thanks for the quick reply and also your answer here #771 (comment)

About this issue, is API still open for discussion?
What do you guys think about using native Promises?
(Users on Node.js can easily polyfill promises with the implementation of their choice)

i.e.

IPFS(options)
.then(ipfs => console.log(ipfs.isOnline()))
.catch(err => console.error(err))

which would/should then work with async/await as well

@thisconnect we do have a ongoing discussion about the exposed interface over here: #557

And worth mentioning, today we're exposing both callbacks and promises via promisify. You can see an example of that over here:

replace: promisify((config, callback) => {
self._repo.config.set(config, callback)
})

Basically, what that means is that we're assuming you're using promises if you don't provide a callback to the function. So func((res) => {console.log(res)}) would work the same way as func().then((res) => {console.log(res)}) today already.

Ok, grabbing this issue again, lot's of discussion have happened with tons of valuable input. I'm going ahead and attempt to coalesce all of the ideas into one proposal in which I'll be implementing in the next couple of days. This will be the time where feedback is more crucial than ever, as I'll have to move forward with this pretty quickly.

New Init API

Init'ing an IPFS instance will look like this:

const IPFS = require('ipfs')

config options = {
  repo: <repoPath or instance>,
  init: true, // defaults to true if repo is not init'ed yet
  // other option
  // init: {
  //  bits: 1024
  // }
  start: true, // defaults to true
  EXPERIMENTAL: {
    // enable experimental features
  },
  config: {
     // overload config options
  }
}

const node = new IPFS(options)

node.on('error', (err) => {
   // error can be:
   //   - network error (libp2p didn't manage to boot up the transports)
   //   - init was set to false but the repo didn't exist
   //   - some other calamity happened
})

node.on('start', () => {
  // node is running
})

node.on('stop', (err) => {
  // node is now stopped
})

// node.stop(<optional callback, fires at the same time as the event 'stop'>)
// node.start(<optional callback, fires at the same time as the event 'start'>)

Satelite modules ipfsd-ctl vs ipfs-daemon vs ipfs-factory

We need a thing that lets us spawn nodes (js-ipfs and go-ipfs) easily, so that apps like orbit, our tests and benchmarks can switch between different combinations (js-ipfs, js-ipfs daemon + js-ipfs-api, go-ipfs + js-ipfs-api) without too much configuration.

This endeavor doesn't have to be part of "Improving Init", but it would be definitely useful to a lot of contributors, users and even for our testing.

Currently, we have:

  • ipfsd-ctl - spawns go-ipfs daemons and returns a js-ipfs-api instance to talk with them
  • ipfs-factory (not a module, just a tool within this repo) - spawns daemons from Node.js and the Browser.
  • ipfs-daemon - https://github.com/haadcode/ipfs-daemon

ipfsd-ctl is the one that has more adoption.

All right, work happening here! #790