apache/incubator-horaedb-meta

Improve the cleaning mechanism of ShardNode

ZuLiangWang opened this issue · 0 comments

Describe this problem
We found that failover in the HoraeDB cluster did not work: when a machine went down, its shards were not migrated to another node.

Steps to reproduce

  • Configure HoraeDB and HoraeMeta with different etcd root paths.
  • Shut down a HoraeDB node.

Additional Information

  • Add an API for dropping a ShardNode, to handle extreme situations.
  • Add a new way to detect failed nodes that does not rely only on etcd lease events.
    1. Use a background thread to continuously check for failed nodes.
    2. Detect failed nodes through heartbeats.