hashicorp/raft

Restarting a Raft node can fail with `panic: log not found [recovered]` when using LogCache

kozlovic opened this issue · 2 comments

On restart, the node panics with a stack trace like this:

panic: log not found [recovered]
	panic: log not found

goroutine 18 [running]:
testing.tRunner.func1.1(0x171e8e0, 0xc00018e190)
	/usr/local/go/src/testing/testing.go:988 +0x452
testing.tRunner.func1(0xc0001ca120)
	/usr/local/go/src/testing/testing.go:991 +0x600
panic(0x171e8e0, 0xc00018e190)
	/usr/local/go/src/runtime/panic.go:975 +0x3e3
github.com/hashicorp/raft.NewRaft(0xc00027a060, 0x188d7e0, 0xc0001c1080, 0x1892180, 0xc0001c0640, 0x188fa20, 0xc0001c0640, 0x188d7a0, 0xc0001824b0, 0x1895a20, ...)
	/Users/ivan/dev/go/src/github.com/hashicorp/raft/api.go:545 +0x2014

LogCache incorrectly caches logs before invoking store.StoreLogs(), so it can hold logs that never made it to the backend storage.
The Raft library can then query for such a log, get it from the cache, and keep track of a log index that it assumes is committed when it is not. After a restart the cache is empty, so the node looks for log X in the backend store and fails to find it.
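
To make the ordering concrete, here is a minimal sketch of a cache that persists first and only caches on success. It is not the library's code: `Log`, `LogStore`, `memStore`, and `writeThroughCache` are hypothetical, trimmed-down stand-ins I made up for illustration, and the sketch skips locking for brevity.

```go
package main

import "errors"

// Log is a hypothetical, trimmed-down stand-in for a Raft log entry.
type Log struct {
	Index uint64
	Data  []byte
}

// LogStore is a hypothetical subset of a log store interface.
type LogStore interface {
	GetLog(index uint64, out *Log) error
	StoreLogs(logs []*Log) error
}

var ErrLogNotFound = errors.New("log not found")

// memStore is an in-memory backend that can simulate failed writes.
type memStore struct {
	logs       map[uint64]*Log
	failWrites bool
}

func (m *memStore) GetLog(index uint64, out *Log) error {
	l, ok := m.logs[index]
	if !ok {
		return ErrLogNotFound
	}
	*out = *l
	return nil
}

func (m *memStore) StoreLogs(logs []*Log) error {
	if m.failWrites {
		return errors.New("simulated write failure")
	}
	for _, l := range logs {
		m.logs[l.Index] = l
	}
	return nil
}

// writeThroughCache keeps recent logs in a ring buffer in front of a store.
// Not safe for concurrent use; a real cache would need a mutex.
type writeThroughCache struct {
	store LogStore
	cache []*Log
}

func newWriteThroughCache(capacity int, store LogStore) *writeThroughCache {
	if capacity <= 0 {
		capacity = 1 // avoid a zero-length ring buffer
	}
	return &writeThroughCache{store: store, cache: make([]*Log, capacity)}
}

// StoreLogs persists to the backend first and populates the cache only if
// that succeeded, so the cache never holds a log the backend lacks.
func (c *writeThroughCache) StoreLogs(logs []*Log) error {
	if err := c.store.StoreLogs(logs); err != nil {
		return err
	}
	for _, l := range logs {
		c.cache[l.Index%uint64(len(c.cache))] = l
	}
	return nil
}

// GetLog serves from the cache when the slot holds the requested index and
// falls back to the backend otherwise.
func (c *writeThroughCache) GetLog(index uint64, out *Log) error {
	cached := c.cache[index%uint64(len(c.cache))]
	if cached != nil && cached.Index == index {
		*out = *cached
		return nil
	}
	return c.store.GetLog(index, out)
}
```

With this ordering, a failed StoreLogs() leaves the cache untouched, so a later GetLog() for that index reports "log not found" right away instead of returning an entry that will be missing after a restart.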

stale commented

Hey there,
We wanted to check in on this request since it has been inactive for at least 90 days.
Have you reviewed the latest godocs?
If you think this is still an important issue in the latest version of the Raft library or
its documentation, please let us know and we'll keep it open for investigation.
If there is still no activity on this request in 30 days, we will go ahead and close it.
Thank you!

stale commented

Hey there, This issue has been automatically closed because there hasn't been any activity for a while. If you are still experiencing problems, or still have questions, feel free to open a new one :+1: