aleph-im/pyaleph

Expose the IPFS daemon to Compute Resource Nodes linked to the CCN

Closed this issue · 4 comments

hoh commented

For scalability, we would like to keep the network as decentralized as possible.

It would therefore make sense for CRNs to download files and volumes from the IPFS daemon of their linked CCN instead of fetching them from the default domain ipfs.aleph.im.

This requires the CCNs to expose their read-only API (port 8080 by default on Kubo) to the CRNs linked to them.
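
As a rough sketch of the lookup order this would imply on the CRN side: the code below assumes the CCN exposes the Kubo gateway over plain HTTP on port 8080 and serves content under the usual `/ipfs/<cid>` gateway path. The host name `ccn.example.org` and the helpers `gateway_urls` and `fetch` are hypothetical and not part of pyaleph.

```python
from urllib.request import urlopen

DEFAULT_GATEWAY = "https://ipfs.aleph.im"  # default domain named in this issue


def gateway_urls(cid, ccn_host, port=8080):
    """Return candidate URLs for a CID: linked CCN first, default gateway last."""
    return [
        f"http://{ccn_host}:{port}/ipfs/{cid}",
        f"{DEFAULT_GATEWAY}/ipfs/{cid}",
    ]


def fetch(cid, ccn_host, timeout=30.0):
    """Try each gateway in order and return the first successful response body."""
    last_error = None
    for url in gateway_urls(cid, ccn_host):
        try:
            with urlopen(url, timeout=timeout) as response:
                return response.read()
        except OSError as error:  # URLError and socket timeouts subclass OSError
            last_error = error
    raise RuntimeError(f"could not fetch {cid} from any gateway") from last_error
```

Calling `fetch("<cid>", "ccn.example.org")` would hit the linked CCN first and only fall back to the shared gateway if that request fails.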

This could be achieved using firewall rules, reverse-proxy configuration, an authentication scheme or a VPN.

In the short term, IPFS servers maintained by aleph.im should only serve to facilitate user uploads to the network and to host static websites from aleph.im (e.g. account.aleph.im).

Using IPFS through an HTTP server feels like doing things the wrong way. The whole advantage of using IPFS is that we can use P2P to fetch blocks in parallel from different servers. IMO starting an IPFS daemon on each CRN would make a lot more sense. We could of course bootstrap it with the multiaddress of its parent CCN, but any CCN should do.
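
A minimal sketch of the bootstrapping idea, assuming Kubo's `ipfs` CLI is installed on the CRN. It only builds and runs `ipfs bootstrap add` invocations; the helper names are hypothetical, and any multiaddress passed in (IP, port, peer ID) would have to come from the actual CCN.

```python
import subprocess


def bootstrap_commands(multiaddrs):
    """Build one `ipfs bootstrap add` command per CCN multiaddress."""
    commands = []
    for addr in multiaddrs:
        # Assumption: addresses use the /p2p/<peer-id> form, so the daemon
        # knows which peer identity to expect at that address.
        if "/p2p/" not in addr:
            raise ValueError(f"multiaddress has no /p2p/<peer-id> part: {addr}")
        commands.append(["ipfs", "bootstrap", "add", addr])
    return commands


def apply(commands):
    """Run the commands against the local Kubo repository."""
    for command in commands:
        subprocess.run(command, check=True)
```

Passing the multiaddresses of several CCNs rather than only the parent would match the "any CCN should do" remark above.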

hoh commented

I guess you mean "starting an IPFS daemon on each CRN".

My main concern is the extra load that the IPFS daemon would add to CRNs. The IPFS daemon currently uses almost one CPU core permanently on CCNs. Maintaining the DHT requires constant CPU and bandwidth usage. This load is already present on the CCN anyway (and will move to Storage Resource Nodes at some point).

In terms of latency, CCNs are expected to already pin the files required by CRNs, so they should be able to stream them directly to the CRNs and avoid slow DHT resolution.

Three important issues with your solution:

  1. This still produces a star-shaped network. What do you do if the CCN is down for maintenance?
  2. You'll still need to check the integrity of the volumes in most cases, meaning computing CIDs. The IPFS daemon already does that.
  3. The setup you suggest with firewall, reverse proxy rules, authentication, etc. requires a lot of new development. None of that is necessary if we use the IPFS daemon directly, at the cost of a bit more resource usage on the CRN.
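
To give a sense of what point 2 involves, here is a sketch that computes a CIDv1 for the simplest possible case: a single block with the `raw` codec and a SHA-256 multihash. This is not how Kubo addresses larger files, which are chunked into a UnixFS DAG, so the result will generally not match the CID of a volume added with `ipfs add`. That gap is part of the work the daemon already does. The function names are hypothetical.

```python
import base64
import hashlib


def raw_block_cid(data: bytes) -> str:
    """CIDv1 (raw codec, sha2-256) of a single block, base32-encoded."""
    digest = hashlib.sha256(data).digest()
    cid_bytes = bytes([
        0x01,  # CID version 1
        0x55,  # multicodec: raw
        0x12,  # multihash function: sha2-256
        0x20,  # digest length: 32 bytes
    ]) + digest
    # Multibase prefix "b" means lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


def matches(data: bytes, expected_cid: str) -> bool:
    """Check a downloaded block against the CID it was requested by."""
    return raw_block_cid(data) == expected_cid
```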

On the other hand, I see work happening on Kubo to improve resource management and add resource constraints. I don't know how easy it is to achieve, but it should be possible to limit what the daemon on CRNs does in order to reduce CPU usage.
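
As an illustration of the kind of tuning that could reduce the daemon's footprint on a CRN, the sketch below builds `ipfs config` commands for two documented Kubo settings: `Routing.Type` set to `dhtclient`, so the node queries the DHT without serving it, and the connection manager's `LowWater` and `HighWater` marks. The numeric values are illustrative guesses rather than measured recommendations, the helper name is hypothetical, and the available options vary between Kubo versions.

```python
import json


def low_resource_config_commands(low_water=20, high_water=40):
    """Build `ipfs config` commands for a lighter daemon (values are assumptions)."""
    if low_water >= high_water:
        raise ValueError("low_water must be below high_water")
    return [
        # Act as a DHT client only: query the DHT but do not answer other peers.
        ["ipfs", "config", "Routing.Type", "dhtclient"],
        # Keep the number of open connections within a small window.
        ["ipfs", "config", "--json", "Swarm.ConnMgr.LowWater", json.dumps(low_water)],
        ["ipfs", "config", "--json", "Swarm.ConnMgr.HighWater", json.dumps(high_water)],
    ]
```

Each command could then be run with `subprocess.run(command, check=True)` before the daemon starts.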

We have set up an IPFS daemon on CRNs, so I think this issue can be closed.