hashicorp/raft

Rejoining cluster with same node causes "raft: rejecting vote request since node is not a voter"

Schachte opened this issue · 4 comments

I have a simple Raft layout: one leader and two followers.
I bootstrap the cluster with the leader and add the two follower nodes to it, and things look fine. Then I knock the leader offline, and one of the followers is promoted to leader after the election.
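For context, the setup is roughly the following (a simplified sketch, not my actual code: in-memory stores and a placeholder FSM just to keep it short):

package main

import (
    "io"
    "log"
    "time"

    "github.com/hashicorp/raft"
)

// noopFSM is a placeholder state machine for this sketch only.
type noopFSM struct{}

func (noopFSM) Apply(*raft.Log) interface{}         { return nil }
func (noopFSM) Snapshot() (raft.FSMSnapshot, error) { return nil, nil }
func (noopFSM) Restore(io.ReadCloser) error         { return nil }

func main() {
    cfg := raft.DefaultConfig()
    cfg.LocalID = raft.ServerID("node_1")

    // In-memory pieces only to keep the sketch short.
    addr, transport := raft.NewInmemTransport("127.0.0.1:5000")
    store := raft.NewInmemStore()
    snaps := raft.NewInmemSnapshotStore()

    r, err := raft.NewRaft(cfg, noopFSM{}, store, store, snaps, transport)
    if err != nil {
        log.Fatal(err)
    }

    // Bootstrap once, on the first node only.
    if err := r.BootstrapCluster(raft.Configuration{
        Servers: []raft.Server{{Suffrage: raft.Voter, ID: cfg.LocalID, Address: addr}},
    }).Error(); err != nil {
        log.Fatal(err)
    }

    // Once node_1 is leader, the two followers are added as voters.
    if err := r.AddVoter("node_2", "127.0.0.1:5100", 0, 10*time.Second).Error(); err != nil {
        log.Println("add node_2:", err)
    }
    if err := r.AddVoter("node_3", "127.0.0.1:5200", 0, 10*time.Second).Error(); err != nil {
        log.Println("add node_3:", err)
    }
}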

Now when I bring the previous leader back online, I get:

raft: rejecting vote request since node is not a voter: from=127.0.0.1:5000

The thing is, all the nodes are voters. I made a small HTTP wrapper around the raft stats for each node (a rough sketch of it is below the output), and they all report the following:

{
  "applied_index": "0",
  "commit_index": "0",
  "fsm_pending": "0",
  "last_contact": "never",
  "last_log_index": "4",
  "last_log_term": "2",
  "last_snapshot_index": "0",
  "last_snapshot_term": "0",
  "latest_configuration": "[{Suffrage:Voter ID:node_1 Address:127.0.0.1:5000} {Suffrage:Voter ID:node_2 Address:127.0.0.1:5100} {Suffrage:Voter ID:node_3 Address:127.0.0.1:5200}]",
  "latest_configuration_index": "0",
  "num_peers": "2",
  "protocol_version": "3",
  "protocol_version_max": "3",
  "protocol_version_min": "0",
  "snapshot_version_max": "1",
  "snapshot_version_min": "0",
  "state": "Candidate",
  "term": "76"

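For reference, the stats above come from a wrapper along these lines (a rough sketch; statsHandler is a made-up name, not my exact code):

import (
    "encoding/json"
    "net/http"

    "github.com/hashicorp/raft"
)

// statsHandler serves raft.Stats() as JSON; the output above is what it returns.
// r is the node's *raft.Raft instance.
func statsHandler(r *raft.Raft) http.HandlerFunc {
    return func(w http.ResponseWriter, _ *http.Request) {
        w.Header().Set("Content-Type", "application/json")
        _ = json.NewEncoder(w).Encode(r.Stats()) // Stats() returns a map[string]string
    }
}

It is mounted on each node with something like http.HandleFunc("/stats", statsHandler(r)).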
https://github.com/hashicorp/raft/blob/main/raft.go#L1532
I see this check, but I just don't get how that's possible.
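For anyone following along, my reading of that check is roughly the following (paraphrased, not the library's actual code):

// wouldRejectVote paraphrases the linked check as I understand it: a vote
// request is rejected when the candidate is not listed as a Voter in the
// receiver's latest configuration.
func wouldRejectVote(latest raft.Configuration, candidateID raft.ServerID) bool {
    if len(latest.Servers) == 0 {
        return false // empty configuration: the node is likely still bootstrapping
    }
    for _, s := range latest.Servers {
        if s.ID == candidateID {
            return s.Suffrage != raft.Voter // known server, but not a voter
        }
    }
    return true // candidate is not in the configuration at all
}

Given that latest_configuration above lists all three nodes as voters on every node, I don't see which branch could be rejecting the vote.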

Any ideas?

Hi @Schachte,

Is this problem reproducible, and if so, can you provide more detailed steps? Can you link to the code that uses raft?

stale commented

Hey there, This issue has been automatically closed because there hasn't been any activity for a while. If you are still experiencing problems, or still have questions, feel free to open a new one :+1

mrpre commented

I'm hitting the same issue.
raft version v1.3.9
One leader (node2) and two followers (node1 and node3).
When I stop node3 and restart it 20s later with its original server ID and data (a sketch of the restart is below the log), node3's log shows:

2022-06-27T18:47:50.920+0800 [INFO]  raft@v1.3.9/api.go:588: initial configuration: index=4 servers="[{Suffrage:Voter ID:node1 Address:11.164.2.90:7001} {Suffrage:Voter ID:node2 Address:11.158.128.78:7002} {Suffrage:Voter ID:node3 Address:11.165.57.57:7003}]"
httpserver: 2022/06/27 18:47:50 join cluster
2022-06-27T18:47:50.920+0800 [INFO]  raft@v1.3.9/raft.go:157: entering follower state: follower="Node at 11.165.57.57:7003 [Follower]" leader-address= leader-id=
httpserver: 2022/06/27 18:47:50 started
2022-06-27T18:47:52.370+0800 [WARN]  raft@v1.3.9/raft.go:239: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
2022-06-27T18:47:52.370+0800 [INFO]  raft@v1.3.9/raft.go:282: entering candidate state: node="Node at 11.165.57.57:7003 [Candidate]" term=3
2022-06-27T18:47:53.776+0800 [WARN]  raft@v1.3.9/raft.go:382: Election timeout reached, restarting election
2022-06-27T18:47:53.776+0800 [INFO]  raft@v1.3.9/raft.go:282: entering candidate state: node="Node at 11.165.57.57:7003 [Candidate]" term=4
2022-06-27T18:47:54.907+0800 [WARN]  raft@v1.3.9/raft.go:382: Election timeout reached, restarting election
2022-06-27T18:47:54.907+0800 [INFO]  raft@v1.3.9/raft.go:282: entering candidate state: node="Node at 11.165.57.57:7003 [Candidate]" term=5
2022/06/27 18:47:55 state ./node3 Candidate
2022-06-27T18:47:56.129+0800 [WARN]  raft@v1.3.9/raft.go:382: Election timeout reached, restarting election
2022-06-27T18:47:56.129+0800 [INFO]  raft@v1.3.9/raft.go:282: entering candidate state: node="Node at 11.165.57.57:7003 [Candidate]" term=6
2022-06-27T18:47:57.381+0800 [INFO]  raft@v1.3.9/raft.go:157: entering follower state: follower="Node at 11.165.57.57:7003 [Follower]" leader-address= leader-id=
2022-06-27T18:47:57.413+0800 [WARN]  raft@v1.3.9/raft.go:1408: failed to get previous log: previous-index=5 last-index=4 error="log not found"
2022/06/27 18:48:00 state ./node3 Follower
2022-06-27T18:48:04.908+0800 [ERROR] raft@v1.3.9/raft.go:1830: failed to make requestVote RPC: target="{Voter node1 11.164.2.90:7001}" error="read tcp 11.165.57.57:44544->11.164.2.90:7001: i/o timeout"
2022/06/27 18:48:05 state ./node3 Follower
2022-06-27T18:48:06.165+0800 [ERROR] raft@v1.3.9/raft.go:1830: failed to make requestVote RPC: target="{Voter node1 11.164.2.90:7001}" error="read tcp 11.165.57.57:44546->11.164.2.90:7001: i/o timeout"
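For completeness, the restart reuses node3's original server ID and on-disk state, roughly like this (a sketch with placeholder paths; the FSM is passed in, not my exact code):

import (
    "net"
    "os"
    "path/filepath"
    "time"

    "github.com/hashicorp/raft"
    raftboltdb "github.com/hashicorp/raft-boltdb"
)

// openNode3 reopens node3 with the same server ID and the existing stores.
func openNode3(fsm raft.FSM) (*raft.Raft, error) {
    cfg := raft.DefaultConfig()
    cfg.LocalID = raft.ServerID("node3") // unchanged across the restart

    // Reuse the on-disk state written before the stop.
    store, err := raftboltdb.NewBoltStore(filepath.Join("./node3", "raft.db"))
    if err != nil {
        return nil, err
    }
    snaps, err := raft.NewFileSnapshotStore("./node3", 2, os.Stderr)
    if err != nil {
        return nil, err
    }

    addr, err := net.ResolveTCPAddr("tcp", "11.165.57.57:7003")
    if err != nil {
        return nil, err
    }
    transport, err := raft.NewTCPTransport("11.165.57.57:7003", addr, 3, 10*time.Second, os.Stderr)
    if err != nil {
        return nil, err
    }

    // No BootstrapCluster here: the configuration (index=4 in the log above)
    // is recovered from the stored logs/snapshots.
    return raft.NewRaft(cfg, fsm, store, store, snaps, transport)
}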

The leader (node2) that processed the AddVoter became a Follower, and node1 became the leader.
I added debug.PrintStack in setState, and the leader's log shows this:

2022-06-27T18:47:50.970+0800 [INFO]  raft@v1.3.9/raft.go:1135: updating configuration: command=AddVoter server-id=node3 server-addr=11.165.57.57:7003 servers="[{Suffrage:Voter ID:node1 Address:11.164.2.90:7001} {Suffrage:Voter ID:node2 Address:11.158.128.78:7002} {Suffrage:Voter ID:node3 Address:11.165.57.57:7003}]"
2022/06/27 18:47:52 state ./node1 Leader
2022-06-27T18:47:52.419+0800 [WARN]  raft@v1.3.9/raft.go:1575: rejecting vote request since we have a leader: from=11.165.57.57:7003 leader=11.164.2.90:7001 leader-id=node1
2022-06-27T18:47:53.793+0800 [WARN]  raft@v1.3.9/raft.go:1575: rejecting vote request since we have a leader: from=11.165.57.57:7003 leader=11.164.2.90:7001 leader-id=node1
2022-06-27T18:47:54.210+0800 [ERROR] raft@v1.3.9/replication.go:629: peer has newer term, stopping replication: peer="{Voter node3 11.165.57.57:7003}"
goroutine 19 [running]:
runtime/debug.Stack()
	/usr/local/go/src/runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
	/usr/local/go/src/runtime/debug/stack.go:16 +0x19
github.com/hashicorp/raft.(*Raft).setState(0xc0000e4000, 0x0)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:1897 +0x6a
github.com/hashicorp/raft.(*Raft).leaderLoop(0xc0000e4000)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:633 +0x505
github.com/hashicorp/raft.(*Raft).runLeader(0xc0000e4000)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:519 +0x46a
github.com/hashicorp/raft.(*Raft).run(0xc0000e4000)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:148 +0x4f
github.com/hashicorp/raft.(*raftState).goFunc.func1()
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/state.go:146 +0x62
created by github.com/hashicorp/raft.(*raftState).goFunc
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/state.go:144 +0x92
2022-06-27T18:47:54.211+0800 [INFO]  raft@v1.3.9/replication.go:492: aborting pipeline replication: peer="{Voter node2 11.158.128.78:7002}"
2022/06/27 18:47:57 state ./node1 Follower
goroutine 99 [running]:
runtime/debug.Stack()
	/usr/local/go/src/runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
	/usr/local/go/src/runtime/debug/stack.go:16 +0x19
github.com/hashicorp/raft.(*Raft).setState(0xc0000e4000, 0x0)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:1897 +0x6a
github.com/hashicorp/raft.(*Raft).appendEntries(0xc0000e4000, {{0x716300, 0xc0004da000}, {0x0, 0x0}, 0xc000538000}, 0xc0004da000)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:1386 +0x37c
github.com/hashicorp/raft.(*Raft).processHeartbeat(0xc0000e4000, {{0x716300, 0xc0004da000}, {0x0, 0x0}, 0xc000538000})
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/raft.go:1353 +0x1cd
github.com/hashicorp/raft.(*NetworkTransport).handleCommand(0xc0000c8000, 0xc0005385a0, 0x7f8378, 0xc0004e17c0)
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/net_transport.go:640 +0x7b6
github.com/hashicorp/raft.(*NetworkTransport).handleConn(0xc0000c8000, {0x7f81b8, 0xc000098280}, {0x7fd718, 0xc0000100b8})
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/net_transport.go:542 +0x3a5
created by github.com/hashicorp/raft.(*NetworkTransport).listen
	/home/shangxu.cjy/code/gopath/pkg/mod/github.com/hashicorp/raft@v1.3.9/net_transport.go:520 +0x3b7

I then restarted node3 again and called AddVoter on node1; AddVoter returned the error "leadership lost while committing log". Now both node2 and node1 are followers.
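The AddVoter call is essentially this (a sketch; the helper name and timeout are made up):

import (
    "time"

    "github.com/hashicorp/raft"
)

// joinNode3 is roughly what the join call does on the node that currently
// thinks it is the leader (node1 here).
func joinNode3(r *raft.Raft) error {
    f := r.AddVoter(
        raft.ServerID("node3"),
        raft.ServerAddress("11.165.57.57:7003"),
        0,              // prevIndex 0: no compare-and-set against the configuration index
        10*time.Second, // arbitrary timeout for this sketch
    )
    // This is the call that comes back with "leadership lost while committing log".
    return f.Error()
}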

If I restart node3 immediately instead, the leader keeps working fine and its log shows:

2022-06-27T18:59:38.013+0800 [WARN]  raft@v1.3.9/raft.go:996: failed to contact: server-id=node3 time=973.003882ms
2022-06-27T18:59:38.173+0800 [INFO]  raft@v1.3.9/replication.go:444: pipelining replication: peer="{Voter node3 11.165.57.57:7003}"
2022-06-27T18:59:38.178+0800 [INFO]  raft@v1.3.9/raft.go:1135: updating configuration: command=AddVoter server-id=node3 server-addr=11.165.57.57:7003 servers="[{Suffrage:Voter ID:node1 Address:11.164.2.90:7001} {Suffrage:Voter ID:node2 Address:11.158.128.78:7002} {Suffrage:Voter ID:node3 Address:11.165.57.57:7003}]"

I had this happen to me this morning after updating from 1.3.5 -> 1.4.1. I have a basic 3-node server setup, and something in the raft state on one of them got corrupted. This caused the other two nodes to constantly reject that server's vote, leading to a stalemate where all three servers were constantly fighting for, and losing, the election.

I haven't seen a raft issue in quite some time, but my guess is that on systemctl stop nomad, either something is misconfigured in my service file or there is a race condition in Nomad's own shutdown process, which causes an errant value to be written on shutdown and manifests on reboot.

Simply removing the server/raft directory and letting it be restored from a remote snapshot on boot "fixes" this, allowing quorum to be reached and elections to complete. Unfortunately I haven't had time to investigate what actually gets corrupted in the raft state itself.