Cluster ID
benbjohnson opened this issue · 0 comments
Currently, LiteFS uses loose cluster membership, and nodes simply accept the state of whichever node is primary. This makes it easy to get up and running, but it can be a problem if a user accidentally connects two existing clusters together. In that case, only one node will become primary, and it will cause the other nodes in both clusters to sync to its state, thus losing data from one cluster.
Cluster ID generation
To prevent this, we suggest adding a randomly generated Cluster ID to LiteFS. This ID will be automatically generated when a node moves to its "ready" state:
- After a node becomes primary, or
- After a node connects to the primary and performs a sync.
The cluster ID will be generated once and saved to a `clusterid` file in the root data directory of LiteFS.
For Consul-based leases, the cluster ID should be saved long-term to `"${lease.consul.key}/clusterid"` if that key is not already set. Any node attempting to become primary should check this key to ensure that it is not attempting to become primary of a different cluster.
Preventing Conflicts
Any time a node connects to another node (e.g. `POST /stream`), the replica will set a `Litefs-Cluster-Id` request header (if it has a cluster ID) and the primary will set a `Litefs-Cluster-Id` response header. The primary and the replica should each reject requests or responses that carry a different cluster ID. If the replica does not have a cluster ID set, it should adopt the cluster ID of the primary.