Pubsub notes
PubSub
- interfaces
- primitives to construct pubsub systems
- semantics
- propagation
- tree-forming
Different designs
- Multi-cast
- Duplex pipe: stream of updates
- Reliable pubsub?
Need to design Interface
- for modules
- for applications
PubSub means
- Event emission
- Network propagation
- Tree forming: grabbing all the nodes and creating a tree that delivers messages efficiently
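To make the tree-forming idea concrete, here is a minimal sketch that builds a delivery tree with a breadth-first search over a peer graph and then pushes a message along the tree edges, so each peer receives it exactly once. The `peers` map, the peer names, and the functions `buildTree` and `propagate` are hypothetical and only for illustration; the notes do not settle on a tree-forming algorithm, and BFS is just one simple choice.

```javascript
// Build a delivery tree over a peer graph using breadth-first search.
// `peers` maps a peer id to the ids it is connected to (hypothetical data).
function buildTree(peers, root) {
  const children = new Map([[root, []]]);
  const queue = [root];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const neighbor of peers[node] || []) {
      if (!children.has(neighbor)) {
        // First time we reach this peer: it becomes a child of `node`.
        children.set(neighbor, []);
        children.get(node).push(neighbor);
        queue.push(neighbor);
      }
    }
  }
  return children;
}

// Walk the tree from the root and record every [from, to] hop a message takes.
function propagate(tree, root) {
  const hops = [];
  const queue = [root];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const child of tree.get(node)) {
      hops.push([node, child]);
      queue.push(child);
    }
  }
  return hops;
}

const peers = {
  a: ['b', 'c'],
  b: ['a', 'c', 'd'],
  c: ['a', 'b', 'd'],
  d: ['b', 'c'],
};
const tree = buildTree(peers, 'a');
const hops = propagate(tree, 'a');
console.log(hops);
```

With four peers the message needs three hops, compared with the ten sends a naive flood over the five links would cost if every peer forwarded to all of its neighbors.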
Lit. review
- Different types of pubsub
- Topic based pubsub: receive messages based on a key
- Hierarchical topic: cumulative messaging based on the path
- Type: tags
- Concept: conditional on topics
- Content-based: need predicates that need to be matched
- Different areas of pubsub
- Forming a tree
- Given a tree/dag how do you propagate optimally?
- Expressing subscriptions (aggregation of subscription)
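As a rough illustration of the difference between topic-based and hierarchical subscriptions listed above, the sketch below compares the two matching rules. It assumes that "cumulative messaging based on the path" means a subscription to a path also receives messages published on any path beneath it; that reading, the `/`-separated path format, and the function names are assumptions, not something the notes define.

```javascript
// Topic-based: a subscription matches only the exact key.
function matchesTopic(subscription, topic) {
  return subscription === topic;
}

// Hierarchical (assumed semantics): a subscription to a path also matches
// every topic published underneath that path.
function matchesHierarchy(subscription, topic) {
  const sub = subscription.split('/').filter(Boolean);
  const path = topic.split('/').filter(Boolean);
  // Compare whole path segments so '/news' does not match '/newsletter'.
  return sub.length <= path.length && sub.every((part, i) => part === path[i]);
}

console.log(matchesTopic('/news', '/news/sports'));      // exact key only
console.log(matchesHierarchy('/news', '/news/sports'));  // parent receives child
console.log(matchesHierarchy('/news/sports', '/news'));  // child does not receive parent
```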
Action points:
- share list of papers @jbenet
cc @haadcode, @jbenet, @diasdavid
Snapshot from Etherpad
Collaborative systems (chat, editors, live streaming) - @haadcode
- Use cases: chat / communication systems, collaborative editors, live streaming
- topic based
- JavaScript library preferred
- ipfs.sub('topic', (sender, message) => ...)
- 'topic' can include wildcards: 'topic-from-haad-*'
- ipfs.publish('topic', message)
- message could be a too
- latency: delivery in < 1ms ;)
- needs to be push (on subscriber side)
- topology: ideally all-to-all (distributed), would be ok with many-to-all (decentralized)
- links: redis has a solid interface (https://www.npmjs.com/package/redis#publish--subscribe)
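The interface wished for above can be sketched with an in-memory stand-in. The `LocalPubSub` class below is hypothetical and is not an existing IPFS API; it only mirrors the proposed `sub('topic', (sender, message) => ...)` and `publish('topic', message)` shapes, treats `*` as a wildcard, and pushes messages to handlers as soon as they are published. The extra `sender` argument on `publish` is an assumption added so the handler has something to report.

```javascript
class LocalPubSub {
  constructor() {
    this.subscriptions = [];
  }

  // Subscribe to a topic pattern; '*' matches any run of characters.
  // Returns a function that cancels the subscription.
  sub(pattern, handler) {
    const escaped = pattern
      .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters except '*'
      .replace(/\*/g, '.*');                 // turn the wildcard into a regex
    const entry = { regex: new RegExp(`^${escaped}$`), handler };
    this.subscriptions.push(entry);
    return () => {
      this.subscriptions = this.subscriptions.filter((s) => s !== entry);
    };
  }

  // Push the message to every handler whose pattern matches the topic.
  publish(topic, message, sender = 'local') {
    for (const { regex, handler } of this.subscriptions) {
      if (regex.test(topic)) handler(sender, message);
    }
  }
}

const pubsub = new LocalPubSub();
const received = [];
pubsub.sub('topic-from-haad-*', (sender, message) => {
  received.push(`${sender}: ${message}`);
});
pubsub.publish('topic-from-haad-chat', 'hello', 'haad');
pubsub.publish('other-topic', 'ignored', 'haad');
console.log(received);
```

A real implementation would replace the loop in `publish` with network propagation; the subscriber-facing shape could stay the same.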
Collaborative News Feeds/Boards (Reddit, News sites and so on)
- topology: some-to-many
- type: topic, content, type, concept (can be any) // type and content: subscription represented as a query, reuse IPLD with predicates, inspiration from XPATH pub-sub??
- reliability: does not need to be a requirement // reliability/correctness-related properties: delivery guarantees, enforce ordering, allow gaps
- realtime: not 100% required // efficiency-related properties: timeliness, limits on the rate/period/length/size of information received
- encryption: important
- interface:
- subscriber:
- Be able to subscribe to one or more topics and validate that each message was generated by a trusted source // regular expressions over the topic space, "content-based" conditions/predicates on each level of the namespace. Generic content-based queries/subscriptions over the real inner content require some kind of prior agreement on format (simple text, XML?, JSON, etc.); lots of inspiration for the interfaces can be derived from XML/XPATH, though it is actually very inefficient.
- Subscribe and validate all the updates through the PubKey found in the IPRS record (https://github.com/ipfs/specs/tree/master/iprs-interplanetary-record-system)
pubsub subscribe --{topic, content, type, concept}="[...]" --auth=/iprs/QmHash
- publisher:
- Be able to join a 'ring of publishers'
- Publish under a key or several keys
- Simple example:
- Wordpress webpage // subscribe to all updates in a section / page / set of pages; subscribe to updates by a person / set of people, or to updates with given content (string) or specific update metadata information
- Have several roles (Author/Editor/Subscriber), organize the network based on these roles (e.g. Authors link to Editors and subscribers, subscribers to subscribers)
- Converge the news feeds through CRDT (offload that responsibility from the pubsub protocol)
// stock ticker based on topic (market / stock domain / company names), based on content (e.g., variation over X%)
// betting system warning system based on domain / competition / team, etc. and based on content (odds over X, odds raising over X%, etc.)
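The stock ticker example combines a topic with a content condition ("variation over X%"). A minimal sketch of that combination follows, assuming publishers and subscribers have already agreed that messages are plain objects with `symbol` and `changePercent` fields. The `ContentPubSub` class, the field names, and the `market/tech` topic are all hypothetical.

```javascript
class ContentPubSub {
  constructor() {
    this.subscriptions = [];
  }

  // `predicate` receives the message and returns true to accept it.
  subscribe(topic, predicate, handler) {
    this.subscriptions.push({ topic, predicate, handler });
  }

  // Deliver to subscriptions whose topic matches and whose predicate accepts
  // the content. Returns the number of deliveries made.
  publish(topic, message) {
    let deliveries = 0;
    for (const s of this.subscriptions) {
      if (s.topic === topic && s.predicate(message)) {
        s.handler(message);
        deliveries += 1;
      }
    }
    return deliveries;
  }
}

const feed = new ContentPubSub();
const alerts = [];
// Only accept price moves larger than 5% in either direction.
feed.subscribe(
  'market/tech',
  (m) => Math.abs(m.changePercent) > 5,
  (m) => alerts.push(m.symbol)
);
feed.publish('market/tech', { symbol: 'AAA', changePercent: 1.2 });
feed.publish('market/tech', { symbol: 'BBB', changePercent: -7.5 });
console.log(alerts);
```

Here the predicate runs at delivery time on the publishing side; where it should run in a real network (publisher, relay, or subscriber) is left open by the notes.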
Listening on IPNS updates (@nicola)
- Description: I want to be able to get updates on the new hash that an IPNS key is signing.
When there is a new update (that invalidates the previous one) and the owner signs it,
the owner should be able to propagate the update to the rest of the network.
Even if the owner does not propagate it to the rest, users who fetch it and have subscribers should be able to propagate it.
- Properties:
- Can skip messages (unreliable)
- No need of ordering (only the latest one really counts!)
- Q: which one is the latest though? (causal ordering!? to be defined)
- Little flooding through the network
- Does not need to persist beyond IPNS expiration time
- On new messages, we can just skip gossiping/emitting the previous ones
- In essence: replaces the DHT via a lazy pub/sub, you only get updates when peers connected to you receive an IPNS update
- API:
- Subscriber
- .sub(HASH) (we could actually make this hierarchical with IPLD pathing HASH/friends/0/blog)
- .unsub(HASH)
- Publisher
- .pub(HASH, content) (no need to re-publish for all the hierarchical changes since .pub should be clever enough)
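The "only the latest one really counts" property can be sketched as follows. Since the notes leave open how "latest" is decided, this sketch simply assumes each published record carries a monotonically increasing `seq` number; that field, the `LatestRecordPubSub` class, and the `QmKey` placeholder are hypothetical. Signature checks and expiration are omitted.

```javascript
class LatestRecordPubSub {
  constructor() {
    this.latest = new Map();      // hash -> most recent record seen
    this.subscribers = new Map(); // hash -> handler
  }

  sub(hash, handler) {
    this.subscribers.set(hash, handler);
  }

  unsub(hash) {
    this.subscribers.delete(hash);
  }

  // Accept a record only if it is newer than the one already known;
  // stale records are dropped instead of being gossiped further.
  // Returns true when the record was accepted.
  pub(hash, record) {
    const current = this.latest.get(hash);
    if (current && current.seq >= record.seq) return false;
    this.latest.set(hash, record);
    const handler = this.subscribers.get(hash);
    if (handler) handler(record);
    return true;
  }
}

const ipns = new LatestRecordPubSub();
const seen = [];
ipns.sub('QmKey', (record) => seen.push(record.value));
ipns.pub('QmKey', { seq: 1, value: '/ipfs/v1' });
ipns.pub('QmKey', { seq: 3, value: '/ipfs/v3' });
const staleAccepted = ipns.pub('QmKey', { seq: 2, value: '/ipfs/v2' }); // older than seq 3
ipns.unsub('QmKey');
ipns.pub('QmKey', { seq: 4, value: '/ipfs/v4' }); // accepted, but nobody is subscribed
console.log(seen, staleAccepted);
```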
Virtual reality (games)
Create shared multiplayer environments, potentially with thousands of users.
Similar requirements to chat, but events are more frequent and smaller
- Interface: Library
- Ideally something that works in JavaScript/browser via WebRTC
- Push
- Persistent (maybe?)
- Low latency
- Ability to prioritize events (some inputs can be discarded, user knows priority)
- Complex filtering (users only need to track events based on their location)
- Ordering of events can be causal
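Two of the requirements above, location-based filtering and discarding low-priority events, can be sketched on the subscriber side like this. The event shape (`x`, `y`, `priority`), the radius, and the budget are hypothetical values chosen for the example; the notes do not define an event format.

```javascript
// Keep only events within `radius` of the player's position.
function nearby(events, position, radius) {
  return events.filter(
    (e) => Math.hypot(e.x - position.x, e.y - position.y) <= radius
  );
}

// When more events arrive than can be handled, keep the `budget`
// highest-priority ones and preserve their original arrival order.
function prioritize(events, budget) {
  const kept = new Set(
    [...events].sort((a, b) => b.priority - a.priority).slice(0, budget)
  );
  return events.filter((e) => kept.has(e));
}

const events = [
  { id: 'step', x: 1, y: 1, priority: 1 },
  { id: 'shot', x: 2, y: 0, priority: 9 },
  { id: 'far-explosion', x: 90, y: 90, priority: 9 },
  { id: 'chat', x: 0, y: 2, priority: 5 },
];
const visible = prioritize(nearby(events, { x: 0, y: 0 }, 10), 2);
console.log(visible.map((e) => e.id));
```

Filtering at the subscriber still means every event crosses the network; pushing the location filter into the subscription itself would be the content-based approach discussed earlier.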
Analytics service
- Subsystem that listens to messages sent from a cluster of nodes pertaining to particular types of events or metrics. Has some ability to alert based on particular thresholds for a metric. Displays graphs/visualizations.
- Ideal interface:
- func publish(topic, message) error
- func read(topic) (message, error)
Properties:
- Persistent queues of messages per topic, but of bounded length; if no reads occur, older messages may be cleared.
- Effortless topic creation (there's no create topic or other topic management on the publisher side)
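A minimal in-memory sketch of the publish/read interface with the two properties above: topics are created on first publish, and each queue drops its oldest message once it exceeds a bound. The `BoundedTopicQueues` class and the bound of 2 are hypothetical, and the sketch is written in JavaScript rather than the Go-style signatures in the notes, so `read` returns `undefined` instead of an error when nothing is available. Real persistence is omitted.

```javascript
class BoundedTopicQueues {
  constructor(maxLength) {
    this.maxLength = maxLength;
    this.queues = new Map(); // topic -> array of pending messages
  }

  // Topics are created implicitly on first publish.
  publish(topic, message) {
    if (!this.queues.has(topic)) this.queues.set(topic, []);
    const queue = this.queues.get(topic);
    queue.push(message);
    // Drop the oldest message once the bound is exceeded.
    if (queue.length > this.maxLength) queue.shift();
  }

  // Return the oldest pending message, or undefined if there is none.
  read(topic) {
    const queue = this.queues.get(topic);
    if (!queue || queue.length === 0) return undefined;
    return queue.shift();
  }
}

const metrics = new BoundedTopicQueues(2);
metrics.publish('cpu', 10);
metrics.publish('cpu', 20);
metrics.publish('cpu', 30); // the queue is full, so 10 is cleared
const first = metrics.read('cpu');
const second = metrics.read('cpu');
const third = metrics.read('cpu');
console.log(first, second, third);
```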
Speaking of papers and the above-referenced issue, here's a link to the pdf that reviews pub/sub systems.
Also, (aside from the livestream) did anyone manage to capture video of this discussion?
@gavinmcdermott I, too, am curious about a video of the discussion
I'm interested in watching the discussion video as well
Has any thought been given to the issue of ethereal vs. persistent in the context of collaboration via pub-sub on IPFS? I.e. is it efficient to capture every fine-grain update from every node into the merkle tree just to enable collaboration? From a distributed data model state transition (i.e. collaboration) perspective, this seems sub-optimal for highly dynamic data.
The holy grail is a unified interface to a local/remote model, allowing an app to deal with data in a way that's abstracted from locality. The key seems to be to enable triggering serialization of ethereal to persistent at appropriate times. I.e. application accesses data in a local paradigm, PubSub syncs updates in an optimal (ethereal --> networked p2p but non-persistent) way, then at some point the 'transaction is committed' which triggers serialization of current state down to IPFS.
A key point here is that a collaborative proxy interface exposed to the app layer needs to be async and type-aware, something similar to https://github.com/tycho01/proxy-dsl
Hope this is constructive & makes sense. Let me know if not welcome.
@Dave--G Good points. Might be good to raise those issues here, https://github.com/libp2p/pubsub, though, as this issue is largely just for the notes for this event.