Syncing initial state or state after a crash where state is lost
cmeiklejohn opened this issue · 0 comments
Migrating from lasp-lang/lasp_pg#13.
With the state-based propagation backend, anti-entropy isn't triggered immediately when a new node joins the cluster. As a result, it may take time before that node sees updates from the other nodes in the cluster.
Reproducer:
- Server 1 starts up
- Server 1 adds Process 1 to a lasp_pg group
- Server 2 starts up
- Server 2 joins as peer
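The reproducer above can be modeled with a minimal sketch, assuming a grow-only set stands in for the lasp_pg group and anti-entropy is a periodic full-state push. The `Node` class and its `join` and `anti_entropy_tick` methods are hypothetical names for illustration, not the Lasp API:

```python
class Node:
    """Toy node; `members` is a grow-only set standing in for a lasp_pg group."""

    def __init__(self, name):
        self.name = name
        self.members = set()
        self.peers = []

    def join(self, other):
        # Joining only records the peer on both sides; no state is exchanged.
        self.peers.append(other)
        other.peers.append(self)

    def anti_entropy_tick(self):
        # Periodic full-state push; merging grow-only sets is set union.
        for peer in self.peers:
            peer.members |= self.members


server1 = Node("server1")
server1.members.add("process1")      # Server 1 adds Process 1 to the group

server2 = Node("server2")
server2.join(server1)                # Server 2 joins as peer

before_tick = set(server2.members)   # still empty: join did not sync
server1.anti_entropy_tick()
after_tick = set(server2.members)    # contains "process1" only after the tick
```

In this model Server 2 sees an empty group between joining and the next tick, which is the window described here.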
The issue becomes more problematic with the delta-based propagation backend when dealing with a new node, or with a failed node that is recovering. Consider the following example:
- Server 1 starts up
- Server 2 joins
- Server 1 updates
- Server 1 buffers the deltas and sends them to Server 2
- Server 2 acknowledges deltas
- Server 2 shuts down with a crash failure (or rejoins with a new identifier)
- Server 2 will receive no changes until the next change to that same data item: nothing is buffered for that node any more, and even if something had been buffered, the node recovered with no disk, so the buffer will be empty
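The delta-based scenario above can be sketched in the same way. This is a simplified model under stated assumptions: each data item is a grow-only set, the sender keeps one buffer of unacknowledged deltas per peer, and delivery plus acknowledgement happen in a single `flush` call. `DeltaNode`, `update`, `flush`, and `crash` are hypothetical names, not the Lasp API:

```python
class DeltaNode:
    """Toy node for delta-based propagation with per-peer delta buffers."""

    def __init__(self, name):
        self.name = name
        self.state = {}    # key -> grow-only set
        self.buffer = {}   # peer name -> list of (key, delta) awaiting ack

    def update(self, key, element, peers):
        # Apply locally and buffer the delta for every known peer.
        self.state.setdefault(key, set()).add(element)
        for peer in peers:
            self.buffer.setdefault(peer.name, []).append((key, {element}))

    def flush(self, peer):
        # Deliver buffered deltas; the peer acknowledges, so they are dropped.
        for key, delta in self.buffer.get(peer.name, []):
            peer.state.setdefault(key, set()).update(delta)
        self.buffer[peer.name] = []

    def crash(self):
        # Recovery with no disk: all local state and buffers are lost.
        self.state = {}
        self.buffer = {}


server1 = DeltaNode("server1")
server2 = DeltaNode("server2")

server1.update("group", "process1", [server2])
server1.flush(server2)                    # Server 2 receives and acks the delta
before_crash = dict(server2.state)

server2.crash()
server1.flush(server2)                    # nothing left buffered for Server 2
after_recovery = dict(server2.state)
```

After the crash, `flush` has nothing to send because the only delta was already acknowledged, so Server 2 stays empty while Server 1 still holds the full state.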