Quentin-M/etcd-cloud-operator

How does clustering work? (Also, joining an additional cluster)

jaxxstorm opened this issue · 8 comments

I've got 3 etcd nodes running behind a consul load balancer, so I set:

etcd:
    advertise-address: eco.service.consul

I can also confirm all the hosts are registered:

[root@i-033aa5b5aafb1f2b5 lbriggs]# host eco.service.consul
eco.service.consul has address <addr>
eco.service.consul has address <addr>
eco.service.consul has address <addr>

However, for some reason, the nodes never form a cluster. I have 3 nodes across 3 availability zones and would expect them to cluster together, but I end up with 3 distinct single-node clusters.

Is there any magic I'm missing?

Okay, figured this out. I had followed a pattern from a previous implementation of our autoscaling groups: one ASG per AZ.

This isn't the way to do it. etcd-cloud-operator uses the ASG name to discover the other hosts to cluster with, so instead you need one ASG spanning all your AZs.
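
For anyone else who trips over this, here's a rough sketch of what ASG-based discovery boils down to. This is not the operator's actual code, just an illustration with the AWS SDK of how an instance can look up its own ASG and enumerate its peers, which is why a single ASG spanning all AZs matters:

```go
// Illustrative sketch only: discover peers through the ASG this instance belongs to.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/ec2metadata"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/autoscaling"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess := session.Must(session.NewSession())

	// Find out which instance we are, then which ASG we belong to.
	instanceID, err := ec2metadata.New(sess).GetMetadata("instance-id")
	if err != nil {
		log.Fatal(err)
	}
	asgc := autoscaling.New(sess)
	di, err := asgc.DescribeAutoScalingInstances(&autoscaling.DescribeAutoScalingInstancesInput{
		InstanceIds: []*string{aws.String(instanceID)},
	})
	if err != nil || len(di.AutoScalingInstances) == 0 {
		log.Fatalf("instance is not part of an ASG: %v", err)
	}
	asgName := di.AutoScalingInstances[0].AutoScalingGroupName

	// List every instance in that same ASG - these become the etcd peers.
	// With one ASG per AZ, each node only ever sees itself here, hence the
	// three single-node clusters.
	dg, err := asgc.DescribeAutoScalingGroups(&autoscaling.DescribeAutoScalingGroupsInput{
		AutoScalingGroupNames: []*string{asgName},
	})
	if err != nil || len(dg.AutoScalingGroups) == 0 {
		log.Fatal(err)
	}
	var ids []*string
	for _, i := range dg.AutoScalingGroups[0].Instances {
		ids = append(ids, i.InstanceId)
	}

	// Resolve the peers' private IPs.
	do, err := ec2.New(sess).DescribeInstances(&ec2.DescribeInstancesInput{InstanceIds: ids})
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range do.Reservations {
		for _, i := range r.Instances {
			fmt.Println(aws.StringValue(i.PrivateIpAddress))
		}
	}
}
```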

Dear @jaxxstorm,

Sorry, I've been out sick the past few days. I am glad you figured it out. Note that the discovery mechanism can be extended as necessary :)

Interesting. @Quentin-M could we use that to add an etcd-cloud-operator cluster to an existing etcd cluster for migration? We'd like to replace our current etcd cluster, which runs on static EC2 instances, with this, and figured we'd have to snapshot and migrate. If we could use etcd-cloud-operator to add to an existing cluster, our lives would be much easier...

If that's possible, do you have a quick example?

Reopening for now for the question :)

@jaxxstorm The core logic expects all the discovered instances to be running etcd-cloud-operator, as it will then coordinate between them to make decisions; the recovery logic is the one that requires this the most. The startup logic, however, could easily be tricked/hacked into joining an existing cluster: if the etcd client were pointed at some of the existing instances rather than simply the instances returned by the ASG, it would determine that the cluster is healthy. From there, I imagine your ECO nodes will have joined your main cluster, and you should be able to decommission your old EC2 instances, assuming they are all behind the same LB (so your clients do not suffer). ECO could then be patched back (without breaking quorum) to using ASG instances only, to ensure the recovery logic will work moving forward.
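
To make the "determine that the cluster is healthy" part concrete, here is a minimal sketch of such a check with the etcd clientv3 package - not ECO's actual code, and the endpoints are made up:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	// Hypothetical endpoints: members of the existing (non-ECO) cluster.
	endpoints := []string{"https://old-etcd-1:2379", "https://old-etcd-2:2379", "https://old-etcd-3:2379"}

	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// If the endpoints answer Status, the startup logic would conclude the
	// cluster is healthy and take the "join existing cluster" path instead
	// of seeding a brand-new one.
	for _, ep := range endpoints {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		st, err := cli.Status(ctx, ep)
		cancel()
		if err != nil {
			fmt.Printf("%s: unhealthy: %v\n", ep, err)
			continue
		}
		fmt.Printf("%s: healthy, leader=%x\n", ep, st.Leader)
	}
}
```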

Dumping a snapshot in S3 may be easiest, if you can afford some amount of downtime for your client applications while you make the switch from one cluster to the other. The operator also supports reading snapshots from disk if that's easier for you.
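
For the snapshot route, here is a minimal sketch of streaming a snapshot out of the existing cluster with clientv3 (again, the endpoint is made up); the resulting file could then be uploaded to S3, or dropped on disk for the operator to restore from:

```go
package main

import (
	"context"
	"io"
	"os"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	// Hypothetical endpoint: any member of the existing cluster.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://old-etcd-1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Stream a snapshot of the current keyspace to a local file.
	rc, err := cli.Snapshot(context.Background())
	if err != nil {
		panic(err)
	}
	defer rc.Close()

	f, err := os.Create("etcd-migration.snapshot.db")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if _, err := io.Copy(f, rc); err != nil {
		panic(err)
	}
}
```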

Instead of hacking the client logic, we may also potentially add some sort of additional state to the state machine (in operator.go, before any other state) that would use some sort of temporary parameter (TBD what exactly; it could be an env var, e.g. ECO_BOOTSTRAP_ENDPOINTS) to force the operator to join existing etcd endpoints without looking at anything else. If the external endpoints are healthy, the state machine will happily transition to & stay in the HEALTHY state. The parameter could then be removed & the old nodes decommissioned.

This way, no real hacking is required, and we could have a documented, no-downtime migration path. That's probably just a few lines' worth of code!

Or this env var could be used for the construction of the etcd client, but I'm sick right now and not able to think through all the possible problematic cases. I think there are some easy programmatic solutions to this migration if you want to avoid downtime, though!
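
Something along these lines, purely as a sketch of the shape (not a patch against the actual operator.go, and ECO_BOOTSTRAP_ENDPOINTS is still just a placeholder name):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// bootstrapEndpoints returns the endpoints the operator's etcd client should
// use: the comma-separated ECO_BOOTSTRAP_ENDPOINTS override if set, otherwise
// the endpoints discovered from the ASG instances.
func bootstrapEndpoints(asgEndpoints []string) []string {
	if v := os.Getenv("ECO_BOOTSTRAP_ENDPOINTS"); v != "" {
		return strings.Split(v, ",")
	}
	return asgEndpoints
}

func main() {
	// During migration: ECO_BOOTSTRAP_ENDPOINTS=https://old-etcd-1:2379,https://old-etcd-2:2379
	// After migration: unset, and discovery falls back to the ASG instances.
	fmt.Println(bootstrapEndpoints([]string{"https://eco-node-1:2379"}))
}
```

During migration the variable would point at the old cluster's client URLs; once the old members are decommissioned, it is simply unset and discovery falls back to the ASG instances only.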

I might try and work on this. I'll rename the issue and leave it open while I ponder.

Closing for inactivity. Feel free to re-open!