cafebazaar/blacksmith

Leader election through etcd

remohammadi opened this issue · 1 comment

If multiple instances are running against the same etcd cluster and the same etcd directory (-etcd-dir), only one of them should be active (responding to DHCP requests), and the others should wait until the leader is killed or otherwise stops.

Most of the work is done in e42d3dc, but there was a bug in that code: goroutines can't easily be stopped from the outside. So in e29e1a2 the stopping logic was removed, and the restart part of the HA mechanism is delegated to whatever service manager runs the blacksmith process (Docker, for example: --restart=always will do the job).
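
The exit-and-restart approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from e29e1a2: `holdLeadership`, `errLeadershipLost`, and the `renew` callback are hypothetical names. The idea is that the leader never tries to stop its worker goroutines; when a lease renewal fails, the whole process exits and the service manager starts a fresh one.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errLeadershipLost is a hypothetical sentinel error for this sketch.
var errLeadershipLost = errors.New("leadership lost")

// holdLeadership calls renew once per tick and returns as soon as a
// renewal fails. It makes no attempt to stop other goroutines; the
// caller is expected to exit the process instead.
func holdLeadership(ticks <-chan struct{}, renew func() bool) error {
	for range ticks {
		if !renew() {
			return errLeadershipLost
		}
	}
	return nil // tick source closed: normal shutdown
}

func main() {
	// Three ticks stand in for a periodic timer in a real process.
	ticks := make(chan struct{}, 3)
	for i := 0; i < 3; i++ {
		ticks <- struct{}{}
	}
	close(ticks)

	calls := 0
	renew := func() bool { calls++; return calls < 3 } // third renewal fails

	if err := holdLeadership(ticks, renew); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // the service manager restarts the process
	}
}
```

After the restart the new process starts as a follower and waits for its turn, so no in-process teardown is needed.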