Orange-OpenSource/casskop

IP crossing when rebooting K8S node with multiple C* nodes

Closed this issue · 1 comments

When a K8S node hosting 2 C* nodes of the same cluster reboots, it may happen that node 1 is assigned the former IP of node 2 and node 2 the former IP of node 1.
C* is not happy with this situation. The following error message appears in the logs:

CassandraDaemon.java:749 - Exception encountered during startup cassandra java.lang.RuntimeException: A node with address /IP_A already exists, cancelling join. Use cassandra.replace_address if you want to replace this node.

We need to find a solution for this, as our production environment does not have enough nodes yet. It seems funny in the cloud age, but that's the way things are for the near future.

After some thought on the subject, we have decided to implement a kill-pod strategy based on the number of restarts of the pod.
We believe that if we delete the pod, K8S will assign it a new IP when it is recreated.
In the case of the Calico plugin, we are confident that a different IP will be given to the pod, so we should be in the clear.
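
To make the idea concrete, here is a minimal sketch of the decision logic only. It is not the CassKop implementation. The restart threshold, the `PodStatus` type and the function names are all hypothetical, and the extra check on the log message is an assumption added to avoid killing pods that restart for unrelated reasons. Actually deleting the pod through the K8S API is left out.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical threshold: the issue does not say how many restarts are tolerated.
RESTART_THRESHOLD = 3

# Fragment of the Cassandra startup error quoted above.
ADDRESS_CLASH_MARKER = "already exists, cancelling join"


@dataclass
class PodStatus:
    """Hypothetical, simplified view of a pod as seen by the operator."""
    name: str
    restart_count: int
    last_log_lines: List[str] = field(default_factory=list)


def should_delete_pod(pod: PodStatus, threshold: int = RESTART_THRESHOLD) -> bool:
    """Return True when the pod looks stuck on the IP crossing error.

    The pod must have restarted at least `threshold` times AND its recent
    logs must contain the address clash message (the log check is an
    assumption, not something stated in the issue).
    """
    if pod.restart_count < threshold:
        return False
    return any(ADDRESS_CLASH_MARKER in line for line in pod.last_log_lines)


def pods_to_delete(pods: List[PodStatus], threshold: int = RESTART_THRESHOLD) -> List[str]:
    """Names of the pods that should be deleted so that K8S recreates them."""
    return [pod.name for pod in pods if should_delete_pod(pod, threshold)]


if __name__ == "__main__":
    clash = ("java.lang.RuntimeException: A node with address /IP_A "
             "already exists, cancelling join.")
    pods = [
        PodStatus("cassandra-0", restart_count=4, last_log_lines=[clash]),
        PodStatus("cassandra-1", restart_count=1, last_log_lines=[clash]),
    ]
    print(pods_to_delete(pods))
```

Deleting the pod rather than letting the container restart in place matters here: a container restart keeps the pod and therefore its IP, while a recreated pod goes through IP allocation again.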