About implementation of logs being caught up
Closed this issue · 4 comments
A quote from chapter 6 of the Raft paper:
To avoid availability gaps, Raft introduces an additional phase before the configuration change, in which the new servers join the cluster as non-voting members (the leader replicates log entries to them, but they are not considered for majorities).
The current implementation works as follows:
- Add a non-voting server to the configuration with AddNonvoter
- Dispatch the configuration change log entry to the existing followers
- Update the latest configuration
- Start catching the new server up
However, from the current implementation, it seems that the server does not become a voter after catching up.
Finally, one question:
- How does the leader know that the new server has caught up on the log?
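To make the question concrete, here is a minimal sketch of one possible check, not something taken from the library. It assumes the leader knows, for each follower, the highest log index replicated to it (matchIndex in the Raft paper) and treats the follower as caught up once it is within some allowed lag of the leader's last index. The function name caughtUp and the maxLag threshold are hypothetical.

```go
package main

import "fmt"

// caughtUp reports whether a follower whose highest replicated log index is
// matchIndex is within maxLag entries of the leader's last log index.
// The name and the threshold are hypothetical; this is not hashicorp/raft API.
func caughtUp(leaderLastIndex, matchIndex, maxLag uint64) bool {
	// Guard against unsigned underflow if the follower is reported ahead.
	if matchIndex >= leaderLastIndex {
		return true
	}
	return leaderLastIndex-matchIndex <= maxLag
}

func main() {
	fmt.Println(caughtUp(1000, 400, 10)) // false: still 600 entries behind
	fmt.Println(caughtUp(1000, 995, 10)) // true: within the allowed lag
}
```

A fixed lag threshold is only one option; a time-based criterion (for example, "the follower has stayed close to the leader for some period") would fit the same shape.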
From what I remember, AddVoter actually does what you want. It should start as a NonVoter and automatically change to Voter once it's caught up. (You should confirm this though, as I'm not 100% sure.)
@JelteF It seems to be a TODO.
AFAIK, AddVoter adds the given server to the cluster and assigns it a vote right away:
func nextConfiguration(current Configuration, currentIndex uint64, change configurationChangeRequest) (Configuration, error) {
	// ...
	configuration := current.Clone()
	switch change.command {
	case AddStaging:
		// TODO: barf on new address?
		newServer := Server{
			// TODO: This should add the server as Staging, to be automatically
			// promoted to Voter later. However, the promotion to Voter is not yet
			// implemented, and doing so is not trivial with the way the leader loop
			// coordinates with the replication goroutines today. So, for now, the
			// server will have a vote right away, and the Promote case below is
			// unused.
			Suffrage: Voter,
			ID:       change.serverID,
			Address:  change.serverAddress,
		}
		// ...
	case AddNonvoter:
		// ...
	}
	// ...
	return configuration, nil
}

func (r *Raft) quorumSize() int {
	voters := 0
	for _, server := range r.configurations.latest.Servers {
		if server.Suffrage == Voter {
			voters++
		}
	}
	return voters/2 + 1
}
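To illustrate why the suffrage matters here, the following standalone sketch reuses the quorum arithmetic from quorumSize above. The Suffrage and Server types are simplified stand-ins written for this example, not the library's own definitions, and quorumSize takes the server list as an argument so the snippet runs on its own.

```go
package main

import "fmt"

// Suffrage and Server are simplified stand-ins for this example only.
type Suffrage int

const (
	Voter Suffrage = iota
	Nonvoter
)

type Server struct {
	ID       string
	Suffrage Suffrage
}

// quorumSize uses the same arithmetic as the quoted method: only servers
// with Voter suffrage count toward the majority.
func quorumSize(servers []Server) int {
	voters := 0
	for _, s := range servers {
		if s.Suffrage == Voter {
			voters++
		}
	}
	return voters/2 + 1
}

func main() {
	base := []Server{{"a", Voter}, {"b", Voter}, {"c", Voter}}

	// Copy the base cluster before appending so the two variants stay independent.
	asNonvoter := append(append([]Server{}, base...), Server{"d", Nonvoter})
	asVoter := append(append([]Server{}, base...), Server{"d", Voter})

	fmt.Println(quorumSize(base))       // 2
	fmt.Println(quorumSize(asNonvoter)) // 2: the new server does not change the quorum
	fmt.Println(quorumSize(asVoter))    // 3: the new server counts immediately
}
```

With three voters, adding a fourth server as a voter raises the quorum from 2 to 3 before that server has replicated anything, which is the availability gap the quoted passage describes.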
IMHO, the new server should be added to the cluster with AddNonvoter and then be automatically promoted to Voter once it has caught up.
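As a rough sketch of that flow, under stated assumptions: the membership interface below is hypothetical and only borrows the method names AddNonvoter and AddVoter (the real methods take more arguments and return a future), the caughtUp callback stands in for whatever catch-up check is chosen, and fakeCluster exists only so the example runs.

```go
package main

import (
	"errors"
	"fmt"
)

// membership is a hypothetical, simplified interface for this sketch.
// It is not the hashicorp/raft API.
type membership interface {
	AddNonvoter(id string) error
	AddVoter(id string) error
}

var errNotCaughtUp = errors.New("server did not catch up in time")

// addThenPromote adds id as a non-voter, polls the caller-supplied caughtUp
// check up to maxPolls times, and promotes the server once the check passes.
func addThenPromote(m membership, id string, caughtUp func() bool, maxPolls int) error {
	if err := m.AddNonvoter(id); err != nil {
		return fmt.Errorf("add non-voter %s: %w", id, err)
	}
	for i := 0; i < maxPolls; i++ {
		if caughtUp() {
			return m.AddVoter(id)
		}
		// A real caller would sleep or wait for a replication event here.
	}
	return errNotCaughtUp
}

// fakeCluster records the last suffrage assigned to each server.
type fakeCluster struct{ suffrage map[string]string }

func (f *fakeCluster) AddNonvoter(id string) error { f.suffrage[id] = "nonvoter"; return nil }
func (f *fakeCluster) AddVoter(id string) error    { f.suffrage[id] = "voter"; return nil }

func main() {
	f := &fakeCluster{suffrage: map[string]string{}}
	polls := 0
	// Pretend the server is caught up on the third poll.
	caughtUp := func() bool { polls++; return polls >= 3 }

	err := addThenPromote(f, "node4", caughtUp, 10)
	fmt.Println(err, f.suffrage["node4"]) // <nil> voter
}
```

If the check never passes, the server simply stays a non-voter and the caller gets errNotCaughtUp, so the quorum is never affected by a server that is still behind.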
We recommend using https://github.com/hashicorp/raft-autopilot to get this functionality. It will add nodes as non-voters and then, after they've been "stable" for long enough and are keeping up, promote them to voters.
Happy to re-open if this doesn't help or if you still have questions.