py-libp2p

The Python implementation of the libp2p networking stack 🐍 [under development]

WARNING

py-libp2p is an experimental and work-in-progress repo under heavy development. We do not yet recommend using py-libp2p in production environments.

Sponsorship

This project is graciously sponsored by the Ethereum Foundation through Wave 5 of their Grants Program.

Maintainers

The py-libp2p team consists of:

@zixuanzh @alexh @stuckinaboot @robzajac

Development

py-libp2p requires Python 3.7. The best way to guarantee a clean Python 3.7 environment is to use virtualenv:

virtualenv -p python3.7 venv
. venv/bin/activate
pip3 install -r requirements_dev.txt
python setup.py develop

Testing

After installing our requirements (see above), you can run the test suite:

cd tests
pytest

Note that tests/libp2p/test_libp2p.py contains an end-to-end messaging test between two libp2p hosts, which is the bulk of our proof of concept.

Feature Breakdown

py-libp2p aims for conformity with the standard libp2p modules. Below is a breakdown of the modules we have developed, are developing, and may develop in the future.

Legend: 🍏 Done   🍋 In Progress   🍅 Missing   🌰 Not planned

libp2p Node           Status
libp2p                🍏

Identify Protocol     Status
Identify              🍋

Transport Protocols   Status
TCP                   🍏
UDP                   🍅
WebSockets            🌰
UTP                   🌰
WebRTC                🌰
SCTP                  🌰
Tor                   🌰
i2p                   🌰
cjdns                 🌰
Bluetooth LE          🌰
Audio TP              🌰
Zerotier              🌰
QUIC                  🌰

Stream Muxers         Status
multiplex             🍏
yamux                 🍅
benchmarks            🌰
muxado                🌰
spdystream            🌰
spdy                  🌰
http2                 🌰
QUIC                  🌰

Protocol Muxers       Status
multiselect           🍏

Switch (Swarm)        Status
Switch                🍏
Dialer stack          🍏

Peer Discovery        Status
bootstrap list        🍅
Kademlia DHT          🍋
mDNS                  🌰
PEX                   🌰
DNS                   🌰

Content Routing       Status
Kademlia DHT          🍋
floodsub              🍏
gossipsub             🍏
PHT                   🌰

Peer Routing          Status
Kademlia DHT          🍏
floodsub              🍏
gossipsub             🍏
PHT                   🌰

NAT Traversal         Status
nat-pmp               🌰
upnp                  🌰
ext addr discovery    🌰
STUN-like             🌰
line-switch relay     🌰
pkt-switch relay      🌰

Exchange              Status
HTTP                  🌰
Bitswap               🌰
Bittorrent            🌰

Consensus             Status
Paxos                 🌰
Raft                  🌰
PBFT                  🌰
Nakamoto              🌰

Explanation of Basic Two Node Communication

Core Concepts

(non-normative, useful for team notes, not a reference)

Several components of the libp2p stack take part when establishing a connection between two nodes:

  1. Host: a node in the libp2p network.
  2. Connection: the layer 3 connection between two nodes in a libp2p network.
  3. Transport: the component that creates a Connection, e.g. TCP, UDP, QUIC, etc.
  4. Streams: an abstraction on top of a Connection representing parallel conversations about different matters, each of which is identified by a protocol ID. Multiple streams are layered on top of a Connection via the Multiplexer.
  5. Multiplexer: a component that is responsible for wrapping messages sent on a stream with an envelope that identifies the stream they pertain to, normally via an ID. The multiplexer on the other end unwraps the message and routes it internally based on the stream identifier.
  6. Secure channel: optionally establishes a secure, encrypted, and authenticated channel over the Connection.
  7. Upgrader: a component that takes a raw layer 3 connection returned by the Transport, and performs the security and multiplexing negotiation to set up a secure, multiplexed channel on top of which Streams can be opened.
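
To make the relationship between these components concrete, here is a minimal sketch of how a raw connection could be upgraded and then used to open streams. Every class and function name here (RawConnection, SecuredConnection, MuxedConnection, Stream, upgrade) is hypothetical and chosen for illustration only; this is not the py-libp2p API, and no real negotiation or encryption takes place.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RawConnection:
    """Hypothetical stand-in for what a Transport (e.g. TCP) returns."""
    remote_addr: str


@dataclass
class SecuredConnection:
    """Hypothetical wrapper added by the optional secure channel."""
    inner: RawConnection
    scheme: str


@dataclass
class Stream:
    stream_id: int
    protocol_id: str


@dataclass
class MuxedConnection:
    """Hypothetical multiplexed connection on which Streams can be opened."""
    inner: object
    streams: Dict[int, Stream] = field(default_factory=dict)
    next_id: int = 0

    def open_stream(self, protocol_id: str) -> Stream:
        # Each new stream gets its own id and is tied to one protocol ID.
        stream = Stream(self.next_id, protocol_id)
        self.streams[stream.stream_id] = stream
        self.next_id += 1
        return stream


def upgrade(raw: RawConnection, security: Optional[str] = None) -> MuxedConnection:
    """Upgrader role: optionally secure the raw connection, then multiplex it."""
    conn = SecuredConnection(raw, security) if security else raw
    return MuxedConnection(conn)


raw = RawConnection("127.0.0.1:8000")             # what the Transport produced
muxed = upgrade(raw, security="example-scheme")   # what the Upgrader produced
echo = muxed.open_stream("/echo/1.0.0")
chat = muxed.open_stream("/chat/1.0.0")
```

Both streams share the single underlying connection; only their stream ids and protocol IDs differ.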

Communication between two hosts X and Y

(non-normative, useful for team notes, not a reference)

Initiate the connection: A host is simply a node in the libp2p network that is able to communicate with other nodes in the network. In order for X and Y to communicate with one another, one of the hosts must initiate the connection. Let's say that X is going to initiate the connection. X will first open a connection to Y. This connection is where all of the actual communication will take place.
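
As a rough illustration of the dialing step, the sketch below uses only Python's standard asyncio library rather than py-libp2p itself: Y listens on a local TCP port and X opens a connection to it. The echo behaviour and the message contents are assumptions made for the example.

```python
import asyncio


async def main() -> bytes:
    # Host Y: listens and echoes back one line per incoming connection.
    async def handle(reader, writer):
        data = await reader.readline()
        writer.write(data)
        await writer.drain()
        writer.close()

    # Port 0 asks the OS for any free port; we read back the one it chose.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # Host X: initiates the connection to Y and sends a message over it.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello from X\n")
    await writer.drain()
    reply = await reader.readline()

    # Tear down both ends.
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return reply


reply = asyncio.run(main())
```

In libp2p terms, the TCP socket opened by X plays the role of the Connection; everything described in the following paragraphs is layered on top of it.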

Communication over one connection with multiple protocols: X and Y can communicate over the same connection using different protocols. The multiplexer routes each message for a given protocol to the handler function registered for that protocol, so each host can handle different protocols with separate functions. Furthermore, a single protocol can use multiple streams, so the same protocol and the same underlying connection can carry conversations about separate topics between X and Y.
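
The per-protocol routing described above can be sketched as a simple registry that maps protocol IDs to handler functions. ProtocolRouter, set_handler, dispatch, and the protocol IDs are all hypothetical names used for illustration; they are not taken from py-libp2p.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], bytes]


class ProtocolRouter:
    """Hypothetical per-host registry of protocol ID -> handler function."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def set_handler(self, protocol_id: str, handler: Handler) -> None:
        self._handlers[protocol_id] = handler

    def dispatch(self, protocol_id: str, payload: bytes) -> bytes:
        # Look up the handler for this protocol and pass the message to it.
        handler = self._handlers.get(protocol_id)
        if handler is None:
            raise KeyError(f"no handler registered for {protocol_id!r}")
        return handler(payload)


router = ProtocolRouter()
router.set_handler("/echo/1.0.0", lambda payload: payload)
router.set_handler("/upper/1.0.0", lambda payload: payload.upper())

echo_reply = router.dispatch("/echo/1.0.0", b"ping")
upper_reply = router.dispatch("/upper/1.0.0", b"ping")
```

Each protocol gets its own function, so adding a new protocol means registering one more handler rather than changing existing ones.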

Why use multiple streams? Running multiple streams over one connection avoids the overhead of maintaining multiple connections between X and Y. To differentiate between messages on different streams and different protocols, each host uses a multiplexer that encodes messages before they are sent and decodes them when they are received. The multiplexer encodes a message by prepending a header that contains the stream id (along with some other info). The message is then sent across the raw connection, and the receiving host uses its multiplexer to decode it, i.e. to determine which stream id the message should be routed to.
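
A minimal sketch of this encode/decode step follows. The header layout used here (a 4-byte stream id followed by a 4-byte payload length) is an assumption made to keep the example short; it is not the wire format of the multiplex muxer, and the function names are hypothetical.

```python
import struct
from typing import Dict, List, Tuple

# Hypothetical header layout: big-endian stream id, then payload length.
HEADER = struct.Struct(">II")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prepend a header identifying the stream the payload belongs to."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(data: bytes) -> List[Tuple[int, bytes]]:
    """Split received bytes back into (stream id, payload) pairs."""
    frames = []
    offset = 0
    while offset < len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        frames.append((stream_id, data[offset:offset + length]))
        offset += length
    return frames


# X interleaves messages from two streams on one connection.
wire = (
    encode_frame(1, b"hello")
    + encode_frame(2, b"status?")
    + encode_frame(1, b" world")
)

# Y demultiplexes: each payload is appended to the buffer for its stream id.
per_stream: Dict[int, bytes] = {}
for stream_id, payload in decode_frames(wire):
    per_stream[stream_id] = per_stream.get(stream_id, b"") + payload
```

Even though the frames for streams 1 and 2 are interleaved on the wire, Y can reassemble each stream independently because every frame carries its stream id.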