bakwc/PySyncObj

Multiple leaders

cnnrznn opened this issue · 1 comment

I've noticed the SyncObj code is riddled with:

# WARNING: there may be multiple leaders

In theory, the Raft protocol guarantees at most one leader. Has anyone observed the opposite in practice?

The Raft protocol only guarantees that there is at most one leader within a given term. It does not guarantee that there is only one leader at a particular moment in time.

I haven't done any real-world tests for this, but it's easy to imagine a case where it happens: the leader for term T gets partitioned from the rest of the network and can reach only a minority of the cluster. One of the remaining nodes times out and is elected leader for term T+1, but the leader for T will not know about this until it gets a message back from one of the nodes it can still talk to. During that time window, there are two leaders (but for different terms).
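To make the scenario concrete, here is a toy simulation in plain Python. It is only a sketch, not PySyncObj code: the `Node` class, `elect`, and `leaders` are hypothetical names invented for this illustration, and the model ignores logs, timeouts, and real networking. It assumes a five-node cluster in which each node grants at most one vote per term and steps down when it sees a higher term.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, heavily simplified Raft node (illustration only)."""
    name: str
    term: int = 0
    role: str = "follower"

    def on_message(self, sender_term: int) -> None:
        # Seeing a higher term forces the node to adopt it and step down.
        if sender_term > self.term:
            self.term = sender_term
            self.role = "follower"


def elect(candidate: Node, reachable: list, cluster_size: int) -> bool:
    """Candidate starts a new term and asks the peers it can reach for votes."""
    candidate.term += 1
    candidate.role = "candidate"
    votes = 1  # the candidate votes for itself
    for peer in reachable:
        # A peer votes only if the candidate's term is newer than its own,
        # which limits it to one vote per term in this toy model.
        if peer.term < candidate.term:
            peer.on_message(candidate.term)
            votes += 1
    if votes > cluster_size // 2:  # strict majority of the whole cluster
        candidate.role = "leader"
    return candidate.role == "leader"


def leaders(nodes: list) -> list:
    """Return (name, term) for every node that currently believes it leads."""
    return [(n.name, n.term) for n in nodes if n.role == "leader"]


a, b, c, d, e = (Node(x) for x in "abcde")
cluster = [a, b, c, d, e]

# Term 1: the network is healthy and "a" wins with every vote.
elect(a, [b, c, d, e], len(cluster))
before_partition = leaders(cluster)

# Partition: "a" can reach only "b". On the other side, "c" times out
# and wins term 2 with votes from "d" and "e" (3 of 5 is a majority).
elect(c, [d, e], len(cluster))
during_partition = leaders(cluster)

# The partition heals: "a" sees a message carrying term 2 and steps down.
a.on_message(c.term)
after_heal = leaders(cluster)

print(before_partition, during_partition, after_heal)
```

During the partition both `a` (term 1) and `c` (term 2) report themselves as leader, yet no term ever has more than one leader, which is the only thing Raft promises.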

These periods with multiple leaders should be short (at most one leader timeout) and rare (assuming your network is sufficiently stable), but they can occur. If it's important for the functioning of the cluster that there is only one "leader" at a time, you need to use a distributed lock.