2 hop bitswap
alanshaw opened this issue · 5 comments
Or as I've affectionately named it: bitswaxx, which is like bitswap++ concatenated and alpha'd.
I had this idea which is a cross between delegated routing and the "friends of friends" policy that scuttlebutt has. This is likely to highlight my inexperience in network/protocol design and it's more than likely to cover ground and research already discounted...but, if this triggers ideas in someone else, or can be adapted somehow then it will make writing this down worthwhile.
The idea is, get others to fetch content for you, on the bitswap level. In short, your peer's wantlist becomes your wantlist.
Imagine this - if I have 10 peers, and each of those 10 peers has 1,000 peers then I've increased the scope of peers who might have the content I'm looking for from 10 to 10,000.
This comes in 2 flavours:
- These peers are my friends, their wantlist is my wantlist
We can't just concat ALL wantlists together, because the world will converge on a single dataset and we'll ALL end up storing each other's stuff. Hard disks explode - obvious problem.
We could do this for peers we deem to be "friends". When we receive their wantlist, it gets added to our wantlist until such time as our friend stops wanting it or we receive it. If we start wanting it ourselves then that trumps the friend want.
It could allow actual IRL friends to host each other's content, or allow one or more public nodes to befriend all nodes in a particular web/mobile/desktop application.
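A minimal sketch of how the friend flavour could be tracked, assuming peers and content are identified by plain strings. The class and method names here (`FriendWantlist`, `receive_want`, and so on) are hypothetical and are not part of any existing bitswap implementation.

```python
from collections import defaultdict


class FriendWantlist:
    """Hypothetical wantlist that merges in wants from peers marked as friends."""

    def __init__(self, friends):
        self.friends = set(friends)        # peer IDs we treat as friends
        self.own = set()                   # CIDs this node wants for itself
        self.by_friend = defaultdict(set)  # CID -> friends that want it

    def want(self, cid):
        # Our own want is tracked separately, so it outlives any friend want.
        self.own.add(cid)

    def receive_want(self, peer, cid):
        # Wants from non-friends are ignored, so wantlists do not all merge.
        if peer in self.friends:
            self.by_friend[cid].add(peer)

    def receive_cancel(self, peer, cid):
        # Drop the friend want once no friend is left wanting the CID.
        wanters = self.by_friend.get(cid)
        if wanters is None:
            return
        wanters.discard(peer)
        if not wanters:
            del self.by_friend[cid]

    def receive_block(self, cid):
        # The want is satisfied; return the friends the block should go to.
        self.own.discard(cid)
        return self.by_friend.pop(cid, set())

    def effective(self):
        # What we would advertise to our peers: own wants plus friends' wants.
        return self.own | set(self.by_friend)
```

For example, if friend `alice` wants `cid1` and later cancels, `cid1` only stays in `effective()` if this node has also called `want("cid1")` itself.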
- Hop based wants
e.g. "wants" <= 2 hops away are added to my wantlist.
The hops are configurable. Bitswap is 1 hop based. Bitswaxx would be 2+ hop based. Each "want" would carry a `hop` field - a number that describes how far away the want came from, and is incremented by 1 on receipt.
A want desired by your node is sent with a `hop` of 0. A want that came from another peer would be sent with whatever the received `hop` was + 1. We'd have to decide what to do with duplicate wants with differing hop counts (take the lowest?).
Provided the `hop` count for a want you receive is <= 2 (for example) the want will be added to your wantlist. This is of course until such time as the peer who wanted it stops wanting it or we receive it. If we start wanting it ourselves then that trumps the want from the peer.
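The hop rules above could be sketched roughly as follows. This is one possible reading, not a settled design: it assumes the limit is checked after the increment on receipt (so a limit of 2 accepts wants that originated at most 2 hops away), and it resolves duplicates by keeping the lowest hop count, which the text only floats as an option. `HopWantlist` and `MAX_HOPS` are hypothetical names.

```python
MAX_HOPS = 2  # assumed default; the proposal says the hop limit is configurable


class HopWantlist:
    """Hypothetical wantlist that accepts wants from up to max_hops away."""

    def __init__(self, max_hops=MAX_HOPS):
        self.max_hops = max_hops
        self.hops = {}  # CID -> lowest hop count seen so far

    def want(self, cid):
        # A want that originates here has hop 0, which trumps any relayed want.
        self.hops[cid] = 0

    def receive_want(self, cid, hop):
        # The hop count is incremented by 1 on receipt.
        hop += 1
        if hop > self.max_hops:
            return False  # too far away: do not add it to our wantlist
        # Duplicate wants with differing hop counts: keep the lowest (assumption).
        if cid not in self.hops or hop < self.hops[cid]:
            self.hops[cid] = hop
        return True

    def outgoing(self):
        # (CID, hop) pairs as they would be sent on to our own peers.
        return sorted(self.hops.items())
```

With a limit of 2, a want received with `hop` 0 or 1 is accepted and stored as 1 or 2, while one received with `hop` 2 is rejected because it would be stored as 3.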
Known issues/open problems:
- Not intelligent
- Arguably, peers of peers are no more likely to have the content you want than the peers you have and it would be better to intelligently converge on a peer who actually has that content.
- This is what the DHT does and it is known to be slow. Bitswaxx might be an effective in-between method that's slower than bitswap but faster than DHT.
- The more peers we can ask for content the more likely we'll find it. Right? We can't actually connect to 10,000 peers simultaneously, but we "can" if we leverage our peer network
- Disk space
- Storing content (albeit temporarily) will have a considerable disk impact
- Bandwidth
- Bitswap will become considerably more chatty
- Multiple peers will receive content they don't want, and may never provide it to the peer that does
- Abuse
- As with all relays this is open to abuse
- We're trusting the `hop` field
- Complicated
- If we want to be able to remove the want from our list (because we don't want it ourselves) we need to track which peers want what - memory issues
Related:
IPFS is supposed to be able to operate in "proxy" mode. That is, bitswap should be able to forward wantlist requests. We don't currently support this but that would cover the caching case quite well.
Really, we desperately need this feature for mobile support.
That whole thread is pretty great, though it is focused on http fallbacks and content routing ideas rather than bitswap hops.
- The concept of "friends" would also be useful for keeping connections alive, bootstrapping, etc.
- In many cases, we can bypass disk and just keep these blocks in memory (for a very short period of time).
- Instead of returning blocks, we could send back the peer ID of the node that has the content (although that's more delegated routing than anything).
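As a rough illustration of the "keep these blocks in memory for a very short period" point, here is a sketch of a time-limited in-memory cache. The name `TransientBlockCache`, the 30 second default and the injectable `clock` parameter are all assumptions made for the example, not existing IPFS APIs.

```python
import time


class TransientBlockCache:
    """Hypothetical in-memory store for blocks fetched on behalf of peers."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested deterministically
        self.blocks = {}    # CID -> (expiry time, block bytes)

    def put(self, cid, data):
        # Blocks never touch disk; they simply expire after ttl seconds.
        self.blocks[cid] = (self.clock() + self.ttl, data)

    def get(self, cid):
        entry = self.blocks.get(cid)
        if entry is None:
            return None
        expires_at, data = entry
        if self.clock() >= expires_at:
            del self.blocks[cid]  # lazily drop an expired block on lookup
            return None
        return data

    def evict_expired(self):
        # Periodic sweep; returns how many blocks were dropped.
        now = self.clock()
        expired = [cid for cid, (exp, _) in self.blocks.items() if now >= exp]
        for cid in expired:
            del self.blocks[cid]
        return len(expired)
```

A relaying node would `put` a block when it arrives for a peer's want, forward it, and let the entry expire rather than pinning or persisting it.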
I like and would support these directions. Both of these approaches will quite likely reduce content resolution time (although more bitswap signalling will again put stress on the system), but I'm not sure if this reduction will be significant.
Ultimately, the problem with doing either of these (hop-based or friend of friend-based) is that it's still happening at the overlay layer. This means that someone who is 1-hop away on the overlay network might be tens of hops away in actual network hops (which will induce delays). This might be ok in the short-term, but is certainly hugely sub-optimal in the long-term.
However, both of the approaches will have a much bigger impact in terms of performance if in the future we design the architecture to support:
- topic-based DHTs: where it's much more likely to have "common interest/wantlist items" with friends and friends of friends.
- topologically embedded routing delegate nodes, where the delegates are chosen such that they're also geographically close (and hence fewer network hops away), so the system would take advantage of locality of interest.
Has any of the above been discussed before? Is there interest in thinking more along those lines?
> The concept of "friends" would also be useful for keeping connections alive, bootstrapping, etc.
@Stebalien Thank you for the idea ... I've brainstormed after reading it - well it escalated a bit... 🤔
Just passing by to let you know that we have just published some initial results for our "jumping bitswap" prototype inspired by @alanshaw's "hop based wants" idea: https://research.protocol.ai/blog/2020/teaching-bitswap-nodes-to-jump/
Let us know what you think about it and if you have additional suggestions, feedback or ideas :)