Cluster does not rebalance well
Closed this issue · 5 comments
When bringing a downed node back into a cluster, the node does not evenly rebalance itself, leaving one or more nodes to take far more traffic than the others.
Can you clarify this?
Cassabon hashes paths to particular servers based on the number of peers specified in the configuration file. If a server disappears, those paths are simply not saved; there is no rebalancing.
The only way any rebalancing can take place is by adding a Cassabon peer to, or removing one from, the YAML configuration, and then sending any one of the peers a SIGHUP.
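To illustrate the point, here is a minimal sketch of hash-modulo-peer-count assignment. It is not Cassabon's code: CRC-32 stands in for the real hash, and the peer names and paths are made up. It only shows that the owner of a path depends on the configured peer list, so ownership changes when that list changes, not when a server goes down.

```python
import zlib

def peer_for(path: str, peers: list[str]) -> str:
    # Stand-in hash (CRC-32); the hash Cassabon actually uses may differ.
    return peers[zlib.crc32(path.encode("utf-8")) % len(peers)]

# Hypothetical peer names and metric paths, for illustration only.
peers = ["peer-a", "peer-b", "peer-c"]
paths = [f"servers.web{i}.cpu" for i in range(1000)]

# Ownership with the original peer list, then with one peer added.
before = {p: peer_for(p, peers) for p in paths}
after = {p: peer_for(p, peers + ["peer-d"]) for p in paths}

# Count the paths whose owner changed once the configured list changed.
moved = sum(1 for p in paths if before[p] != after[p])
print(f"{moved} of {len(paths)} paths changed owner")
```

A downed peer is still in the list, so its paths keep hashing to it and nothing is reassigned.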
Well, for example, when rebooting the cluster, some servers handle more stats (and are busier) than others. Check out the Cassabon graph to see what I mean. There's always one or two servers handling more load than the others.
We can pull the extant paths out of Redis, and write a program to see the Pearson hash distribution we get for them.
The Pearson hash is table-driven, and we can tweak the table to get a better distribution if we have a problem there.
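Something along these lines could serve as that program. It is a rough sketch under assumptions: it uses the classic 8-bit Pearson hash reduced modulo the peer count, the tables are seeded shuffles rather than Cassabon's actual table, and the paths are made up instead of being read from Redis.

```python
import random
from collections import Counter

def make_table(seed: int) -> list[int]:
    # A Pearson table is a permutation of 0..255; this one is a seeded shuffle.
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table

def pearson8(data: bytes, table: list[int]) -> int:
    # Classic 8-bit Pearson hash: one table lookup per input byte.
    h = 0
    for b in data:
        h = table[h ^ b]
    return h

def distribution(paths: list[str], table: list[int], n_peers: int) -> list[int]:
    # Number of paths landing on each peer index (hash modulo peer count).
    counts = Counter(pearson8(p.encode("utf-8"), table) % n_peers for p in paths)
    return [counts.get(i, 0) for i in range(n_peers)]

def imbalance(counts: list[int]) -> float:
    # Busiest peer relative to the mean; 1.0 would be a perfectly even split.
    return max(counts) / (sum(counts) / len(counts))

# Hypothetical paths; in practice these would be the paths pulled from Redis.
paths = [f"servers.host{i}.load.{m}" for i in range(500) for m in ("1m", "5m")]

# Compare a few candidate tables for a hypothetical 5-peer cluster.
for seed in range(3):
    counts = distribution(paths, make_table(seed), 5)
    print(seed, counts, round(imbalance(counts), 3))
```

Running it against the real path set with a few candidate tables would show whether the table is the cause of the skew, and which table gives the flattest counts.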
Sounds good.
Seems to be pretty well resolved.