A script that attempts to re-balance Elasticsearch shards by size without changing the existing balancing.
By default ES balances shards over nodes by considering:
- The number of shards / node
- The number of shards / index / node
This is great if every shard is the same size, but in reality that is not the case. Because ES does not consider the size of shards (except for watermarks), it's possible to end up with some nodes almost at the watermark alongside others that are almost empty. This blog post describes it well - this script is inspired by that project.
As well as size, this logic can be applied to any weighting of shards.
This diagram highlights the problem elasticsearch-rebalancer attempts to solve:
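As a rough numeric illustration of the same problem, the sketch below uses a made-up two-node cluster where both the shard counts and the per-index shard counts are balanced, but disk usage is not. The node names, index names and sizes are all hypothetical.

```python
# Hypothetical cluster: every node holds one shard of each index,
# so both of the default balancing criteria are satisfied.
shards = [
    {"index": "logs", "node": "node-a", "size_gb": 50},
    {"index": "metrics", "node": "node-a", "size_gb": 40},
    {"index": "logs", "node": "node-b", "size_gb": 5},
    {"index": "metrics", "node": "node-b", "size_gb": 1},
]


def summarise_nodes(shards):
    """Return {node: (shard_count, total_size_gb)}."""
    summary = {}
    for shard in shards:
        count, size = summary.get(shard["node"], (0, 0))
        summary[shard["node"]] = (count + 1, size + shard["size_gb"])
    return summary


node_summary = summarise_nodes(shards)
for node, (count, size) in sorted(node_summary.items()):
    print(f"{node}: {count} shards, {size} GB")
# node-a: 2 shards, 90 GB
# node-b: 2 shards, 6 GB
```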
The script is based around the idea of "swaps" - pairs of shards to relocate between two nodes. On each iteration the script identifies the most-full and least-full nodes, then searches through their largest and smallest shards to find a suitable swap. Ideally, the largest shard on the most-full node and the smallest shard on the least-full node swap.
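The search described above can be sketched roughly as follows. This is a simplified illustration, not the script's actual code: the `find_swap` helper, the shape of the shard dictionaries and the "swap must narrow the gap" test are assumptions made for the example, and any restriction on which index a shard belongs to is left out for brevity.

```python
def node_totals(shards):
    """Sum shard weights per node."""
    totals = {}
    for shard in shards:
        totals[shard["node"]] = totals.get(shard["node"], 0) + shard["weight"]
    return totals


def find_swap(shards):
    """Return (shard_from_fullest_node, shard_from_emptiest_node), or None."""
    totals = node_totals(shards)
    if not totals:
        return None
    max_node = max(totals, key=totals.get)
    min_node = min(totals, key=totals.get)
    gap = totals[max_node] - totals[min_node]

    # Largest shards first on the fullest node, smallest first on the emptiest.
    big = sorted((s for s in shards if s["node"] == max_node),
                 key=lambda s: s["weight"], reverse=True)
    small = sorted((s for s in shards if s["node"] == min_node),
                   key=lambda s: s["weight"])

    for big_shard in big:
        for small_shard in small:
            diff = big_shard["weight"] - small_shard["weight"]
            # A swap changes the gap to |gap - 2 * diff|,
            # so it only helps when 0 < diff < gap.
            if 0 < diff < gap:
                return big_shard, small_shard
    return None


# Hypothetical shards: node-a holds 90 units of weight, node-b holds 6.
shards = [
    {"name": "a0", "node": "node-a", "weight": 50},
    {"name": "a1", "node": "node-a", "weight": 40},
    {"name": "b0", "node": "node-b", "weight": 5},
    {"name": "b1", "node": "node-b", "weight": 1},
]
swap = find_swap(shards)
print([s["name"] for s in swap])  # ['a0', 'b1']
```

In this example the swap leaves node-a with 41 and node-b with 55, shrinking the gap from 84 to 14 while each node still holds two shards.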
To maintain existing ES balances, shards are only considered if the node to move to does not have any other shard from the same index. This means the shards per node and shards per index per node remain the same, so ES shouldn't do any additional rebalancing.
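A minimal sketch of that check, followed by the kind of request a swap could translate into. The `can_move` and `build_swap_body` helpers and the shard dictionaries are hypothetical. The `move` command with `index`, `shard`, `from_node` and `to_node` is part of Elasticsearch's cluster reroute API (`POST /_cluster/reroute`), but whether the script builds its requests exactly this way is an assumption.

```python
import json


def can_move(shard, target_node, shards):
    """True if target_node holds no shard from the same index as `shard`."""
    return not any(
        other["node"] == target_node and other["index"] == shard["index"]
        for other in shards
    )


def build_swap_body(shard_a, shard_b):
    """Build a cluster reroute body that exchanges two shards between nodes."""
    def move(shard, to_node):
        return {"move": {
            "index": shard["index"],
            "shard": shard["shard"],
            "from_node": shard["node"],
            "to_node": to_node,
        }}
    return {"commands": [
        move(shard_a, shard_b["node"]),
        move(shard_b, shard_a["node"]),
    ]}


# Hypothetical cluster state.
shards = [
    {"index": "logs-2024", "shard": 0, "node": "node-a"},
    {"index": "metrics", "shard": 0, "node": "node-a"},
    {"index": "logs-2024", "shard": 1, "node": "node-b"},
    {"index": "events", "shard": 0, "node": "node-b"},
]

# node-b already holds a logs-2024 shard, so that shard is skipped.
print(can_move(shards[0], "node-b", shards))  # False
# metrics and events each live on one node only, so they can be exchanged.
print(can_move(shards[1], "node-b", shards))  # True
print(can_move(shards[3], "node-a", shards))  # True

body = build_swap_body(shards[1], shards[3])
print(json.dumps(body, indent=2))
```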
Usage: es-rebalance [OPTIONS] ES_HOST

Options:
  --iterations INTEGER  Number of iterations (swaps) to execute.
  --attr TEXT           Node attributes in form key=value.
  --commit              Execute the shard reroutes (default print only).
  --print-state         Print the current nodes & weights and exit.
  --index-name TEXT     Filter the indices for swaps by name, supports
                        wildcards.
  --max-node TEXT       Force the max node to consider for shard swaps.
  --min-node TEXT       Force the min node to consider for shard swaps.
  --one-way             Disables shard swaps and simply moves max -> min. Note
                        after ES rebalancing is restored ES will attempt to
                        rebalance itself according to its own heuristics.
  --help                Show this message and exit.
By default the es-rebalance script uses shard size (in bytes) as the weight indicator. It is possible to customise this by writing your own CLI - for example:
from elasticsearch_rebalancer import make_rebalance_elasticsearch_cli


def get_shard_weight(shard):
    # Weight every shard equally instead of by its size in bytes.
    return 1


if __name__ == '__main__':
    # Build a CLI that uses the custom weight function.
    rebalance_elasticsearch = make_rebalance_elasticsearch_cli(
        get_shard_weight_function=get_shard_weight,
    )
    rebalance_elasticsearch()
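A less trivial weight function might balance by document count instead of bytes. The sketch below assumes the `shard` argument is a dictionary carrying a `docs` field, as the Elasticsearch `_cat/shards` API reports it (a numeric string, or empty for unassigned shards). The exact shape of the object the library passes in is not stated here, so check it before relying on this.

```python
def get_shard_weight(shard):
    # Assumption: `shard` is a dict with a `docs` field holding the
    # document count as a string, or None/empty when it is unknown.
    docs = shard.get("docs")
    return int(docs) if docs else 0


# Hypothetical shard records for illustration.
print(get_shard_weight({"index": "logs", "docs": "1200"}))  # 1200
print(get_shard_weight({"index": "logs", "docs": None}))    # 0
```

Such a function would be passed as `get_shard_weight_function` in the same way as the example above.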