Leader stepping down with membership change
jabolina opened this issue · 2 comments
Our election algorithms build on top of JGroups' view changes. Removing the leader through membership operations only changes the Raft cluster members, but the view is unchanged.
A leader removing itself does not step down and can still replicate operations! After the leader is removed and the operation committed, the leader needs to step down, and an election round needs to be initiated.
I still need to investigate how to address this. The Raft dissertation §3.10 has a leader transfer extension, which might give us some hints. The solution should not affect the election mechanism by view changes.
In the Raft dissertation, §4.2.2 describes removing the current leader. The suggestion is to utilize the leadership transfer extension. We carry on with the membership operation, and the leader steps down after it is committed. This ordering is necessary to make progress: stepping down first and then removing the node could be difficult, as it would require an election to happen first.
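To make the ordering concrete, here is a minimal sketch of a leader that steps down only after the membership entry removing it has been committed. The class `SelfRemovalSketch` and all of its methods are hypothetical names for illustration, not part of the jgroups-raft API.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: none of these names come from jgroups-raft.
final class SelfRemovalSketch {
    enum Role { LEADER, FOLLOWER }

    private final String self;
    private final Set<String> members;
    private Role role = Role.LEADER;
    private boolean electionRequested;

    SelfRemovalSketch(String self, Set<String> initialMembers) {
        this.self = self;
        this.members = new LinkedHashSet<>(initialMembers);
    }

    // Assumed to be called only after the membership entry is committed.
    void onMembershipCommitted(Set<String> newMembers) {
        members.clear();
        members.addAll(newMembers);
        if (role == Role.LEADER && !members.contains(self)) {
            role = Role.FOLLOWER;     // step down: stop replicating
            electionRequested = true; // remaining members must elect a leader
        }
    }

    Role role() { return role; }

    boolean electionRequested() { return electionRequested; }
}
```

Until the commit callback fires, the node keeps acting as leader, so the removal itself can still be replicated.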
Going through the leader transfer mechanism in §3.10, I do not find it interesting to include. Quoting: "we have not currently implemented or evaluated this leadership transfer approach." However, we can utilize some of the ideas.
A node can only be a leader if it has an up-to-date log. Since the membership operation also goes through consensus, once it is committed we are sure we have nodes with up-to-date logs. We could start an election round on the current leader and utilize only the current member list to restrict who participates.
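A rough sketch of what restricting the election to the current member list could look like, assuming the view still contains the removed leader. `RestrictedElectionSketch`, `eligible`, and `hasMajority` are hypothetical names, and counting the majority against the Raft member list rather than the view is an assumption taken from standard Raft, not from the existing election code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: none of these names come from jgroups-raft.
final class RestrictedElectionSketch {
    // Nodes present in the view that are also in the committed Raft member list.
    static Set<String> eligible(List<String> view, Set<String> raftMembers) {
        Set<String> result = new LinkedHashSet<>(view);
        result.retainAll(raftMembers);
        return result;
    }

    // Votes from non-members are ignored; the majority is over the member list.
    static boolean hasMajority(Set<String> votes, Set<String> raftMembers) {
        long valid = votes.stream().filter(raftMembers::contains).count();
        return valid > raftMembers.size() / 2;
    }
}
```

With a view of A, B, C and a member list of B, C, node A is filtered out even though it is still in the view, and its vote does not count toward the majority.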
The tricky part is handling the pending requests, where we might have pending requests during the membership operations. The simplest solution would be to complete everything exceptionally. The complex solution would be to make the current leader enqueue requests and redirect everything after a new leader is elected.
The latter would require changes mostly to the REDIRECT protocol. We could return a specific error code to retry.
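The simpler option, completing everything exceptionally, could be sketched as below. `PendingRequestsSketch` and `LeaderSteppedDownException` are hypothetical names, and the exception stands in for the "specific error code to retry"; only `CompletableFuture` is a real JDK class.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Illustrative only: none of these names come from jgroups-raft.
final class PendingRequestsSketch {
    // Hypothetical retryable error signalling that the leader stepped down.
    static final class LeaderSteppedDownException extends RuntimeException {
        LeaderSteppedDownException() {
            super("leader stepped down, retry against the new leader");
        }
    }

    private final Queue<CompletableFuture<byte[]>> pending = new ArrayDeque<>();

    // Registers a request that has not been committed yet.
    CompletableFuture<byte[]> submit() {
        CompletableFuture<byte[]> future = new CompletableFuture<>();
        pending.add(future);
        return future;
    }

    // Fails every pending request; returns how many were actually failed.
    int failAll() {
        int failed = 0;
        CompletableFuture<byte[]> future;
        while ((future = pending.poll()) != null) {
            if (future.completeExceptionally(new LeaderSteppedDownException())) {
                failed++;
            }
        }
        return failed;
    }
}
```

A client that sees this exception could resubmit once a new leader is known, which avoids buffering requests on the departing leader.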