hashicorp/raft

Question: does hashicorp/raft support IP dual-stack?

Yanhao opened this issue · 1 comments

Say I deploy hashicorp/raft on three dual-stack machines, and each one calls NewRaft() with a v4 address to initialize the raft group. I know the raft group should work fine at that point. But if I restart one machine and replace its address with a v6 address (calling NewRaft() with a v6 address), will the raft group continue to work fine?
What I want is to replace the v4 addresses with v6 addresses step by step.

banks commented

There are several parts to this answer!

Network Address Compatibility

This raft library has a StreamTransport abstraction, so the core library doesn't know anything about network details at all. We do provide a generic NetworkTransport implementation layer, and a specific TCPTransport too. I don't know of any specific testing of the TCP transport with IPv6, but I don't see anywhere that we rely on v4-specific assumptions: the calling code has to provide a bindAddr string and an advertise net.Addr, which can represent either address type.
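To illustrate the point about net.Addr covering both address families, here is a minimal sketch that uses only Go's standard net package. The helper names advertiseAddr and isIPv6 are hypothetical and are not part of hashicorp/raft; the sketch only shows that the same code path can produce a v4 or a v6 advertise address, and it says nothing about how the provided TCP transport behaves on a dual-stack host.

```go
package main

import (
	"net"
)

// advertiseAddr is a hypothetical helper (not part of hashicorp/raft). It
// parses a host:port string into a net.Addr. IPv6 literals must be written
// in brackets, e.g. "[::1]:8300".
func advertiseAddr(hostport string) (net.Addr, error) {
	tcp, err := net.ResolveTCPAddr("tcp", hostport)
	if err != nil {
		return nil, err
	}
	return tcp, nil
}

// isIPv6 reports whether addr is a TCP address whose IP has no IPv4 form.
func isIPv6(addr net.Addr) bool {
	tcp, ok := addr.(*net.TCPAddr)
	return ok && tcp.IP != nil && tcp.IP.To4() == nil
}
```

Either result satisfies net.Addr, so a caller that takes an advertise net.Addr does not need separate handling for v4 and v6 values.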

Even if for some reason the provided TCP transport does not do the right thing for your dual-stack setup, it's possible to implement your own transport that does and still use this library. For example, Consul uses its own wrapper to set up TLS and some custom protocol multiplexing, which means none of the networking details in this library are actually used anyway!

Server Identification

The other part of your question is whether changing the IP of the configured machine will break its raft config or make it appear as a new node in the raft cluster, etc.

Earlier versions of this library did rely on the IP address being the unique identifier for a node in the raft configuration, which made changing IPs while keeping the same state a problem.

Now though (assuming you don't have nodes still running the old raft protocol version) you should provide a stable unique ID with the LocalID config param. If you do so, nodes should be able to come back with new IP addresses: as long as a node's LocalID and raft state remain the same, it should be able to rejoin the cluster. There is no automatic mechanism for peers to discover the change and reconfigure, though - your application needs to do that somehow (automated or operator-driven).

However you orchestrate that, the current leader will need to call AddVoter again with the new address. Note the docs about how calling this for an existing peer ID updates the address in every server's raft config:

raft/api.go

Lines 906 to 914 in 9174562

// AddVoter will add the given server to the cluster as a staging server. If the
// server is already in the cluster as a voter, this updates the server's address.
// This must be run on the leader or it will fail. The leader will promote the
// staging server to a voter once that server is ready. If nonzero, prevIndex is
// the index of the only configuration upon which this change may be applied; if
// another configuration entry has been added in the meantime, this request will
// fail. If nonzero, timeout is how long this server should wait before the
// configuration change log entry is appended.
func (r *Raft) AddVoter(id ServerID, address ServerAddress, prevIndex uint64, timeout time.Duration) IndexFuture {

For example, in Consul this is automated by using our gossip layer to discover an IP change.
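The "same ID, new address" behaviour described in the AddVoter doc comment can be pictured with a toy model. This is not the library's implementation: Server and upsertVoter are hypothetical names, and the sketch ignores leadership, staging, prevIndex, and log replication. It only shows the outcome for the configuration.

```go
package main

// Server is a simplified stand-in for one entry in a raft configuration.
type Server struct {
	ID      string
	Address string
}

// upsertVoter is a hypothetical model of the documented outcome: if id is
// already present its address is replaced, otherwise a new entry is added.
// It returns a new slice and leaves the input untouched.
func upsertVoter(servers []Server, id, address string) []Server {
	out := make([]Server, 0, len(servers)+1)
	found := false
	for _, s := range servers {
		if s.ID == id {
			s.Address = address // same node, new address
			found = true
		}
		out = append(out, s)
	}
	if !found {
		out = append(out, Server{ID: id, Address: address})
	}
	return out
}
```

In this model, moving node-2 from 10.0.0.2:8300 to [2001:db8::2]:8300 keeps the cluster at three members. With the real library, the corresponding step is the leader calling AddVoter with the node's existing ServerID and its new ServerAddress.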

Hope this is useful!