hashicorp/raft

No good way to watch leadership

Jille opened this issue · 6 comments

Jille commented

Raft.LeaderCh() has two problems: It always returns the same channel and it doesn't prepopulate the channel with the current value. Reconstructing this has proven error prone/impossible.

RegisterObserver() was my next attempt, but the LocalAddress isn't exposed (except by the Transport, which not be available to library code).

I propose to make Raft.LeaderCh() return a new channel per call, and prepopulate that with the current value. That's a slight behavior change, but I don't expect it to break clients.

stale commented

Hey there,
We wanted to check in on this request since it has been inactive for at least 90 days.
Have you reviewed the latest godocs?
If you think this is still an important issue in the latest version of the Raft library or
its documentation please feel let us know and we'll keep it open for investigation.
If there is still no activity on this request in 30 days, we will go ahead and close it.
Thank you!

Jille commented

still relevant

Howdy @Jille,
Thank you for bringing this issue up. I have a couple of follow up questions, but I think I would benefit on understanding the use case. Are you able to elaborate on how you're using Raft.LeaderCh() and how these enhancements would benefit your implementation in a bit more detail?
Also, would you be able to elaborate on why you'd rather have it return a new channel per call?

Thank you!

Jille commented

Hey @s-christoff. I wrote a library that marks the gRPC server as unhealthy if it's not the raft leader.

With the current implementation only one place can get the channel and watch it for leader changes. If multiple places would use it, they would steal events from the channel from another and they'd all get an incomplete view. This makes it impossible for a library to use it, because it can't guarantee that the application isn't using it.

https://github.com/Jille/raft-grpc-leader-rpc/blob/reliable-leaderch/leaderhealth/leaderhealth.go#L24 checks whether my patch (#427) is in and uses it if available. The alternative is using LeaderObservation, but that has race conditions.

Hi @Jille,

Sorry this has dragged on for so long. Can you elaborate on the race you speak of with LeaderObservation? I'm aware that it can be lossy, but you can specify that you want blocking, such that events will not be discarded. I think the PR you've authored is probably fine, but I'd prefer to avoid having multiple mechanisms for doing this if possible.

stale commented

Hey there, This issue has been automatically closed because there hasn't been any activity for a while. If you are still experiencing problems, or still have questions, feel free to open a new one :+1