Kafka broker not coming back during scale-down
Is your feature request related to a problem? Please describe.
During a scale-down operation, if a broker that is to be removed is deleted in an uncontrolled fashion (while its data is still being drained to the other brokers), the operator doesn't bring the broker back up to finish the drain and then remove it again.
This in turn will slow down new replicas getting in sync, as they will have to pull from two (or, even worse, one) in-sync replicas.
Even more concerning, if there is a K8s cluster rolling restart (due to VM upgrades, for example) and the Kafka pods don't come back up, this will result in offline partitions.
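Both symptoms show up directly in cluster metadata. For reference, here is a minimal sketch using the sarama admin client that flags them; the bootstrap address and Kafka version are assumptions, not values from this issue:

```go
package main

import (
	"fmt"
	"log"

	"github.com/IBM/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Version = sarama.V2_8_0_0 // assumption: match your broker version

	admin, err := sarama.NewClusterAdmin([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	// Gather all topic names, then fetch their partition metadata.
	topics, err := admin.ListTopics()
	if err != nil {
		log.Fatal(err)
	}
	names := make([]string, 0, len(topics))
	for name := range topics {
		names = append(names, name)
	}

	metadata, err := admin.DescribeTopics(names)
	if err != nil {
		log.Fatal(err)
	}
	for _, t := range metadata {
		for _, p := range t.Partitions {
			switch {
			case p.Leader == -1:
				// No live leader at all: the partition is offline.
				fmt.Printf("OFFLINE  %s-%d replicas=%v\n", t.Name, p.ID, p.Replicas)
			case len(p.Isr) < len(p.Replicas):
				// Fewer in-sync replicas than assigned replicas.
				fmt.Printf("UNDER-REPLICATED  %s-%d isr=%v of %v\n", t.Name, p.ID, p.Isr, p.Replicas)
			}
		}
	}
}
```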
Describe the solution you'd like to see
If a broker is killed in an uncontrolled fashion while it is being drained, the broker should be brought back into the cluster and only removed after all of its replicas have been moved off.
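A minimal sketch of what that reconcile behaviour could look like; Broker, podRunning, reassignmentDone, recreatePod and deleteBroker are all hypothetical placeholders, not the operator's actual types or APIs:

```go
package main

import (
	"context"
	"fmt"
)

// Broker is a hypothetical stand-in for the operator's broker model.
type Broker struct {
	ID       int32
	Draining bool // a scale-down drain is in progress
}

// The probes below are placeholders for checks the operator would make
// against the Kubernetes API and the Kafka cluster.
func podRunning(ctx context.Context, b Broker) bool       { return false }
func reassignmentDone(ctx context.Context, b Broker) bool { return false }
func recreatePod(ctx context.Context, b Broker) error     { return nil }
func deleteBroker(ctx context.Context, b Broker) error    { return nil }

// reconcileScaleDown sketches the requested behaviour: a broker that
// dies mid-drain is brought back first, and is only removed once all
// of its replicas have been reassigned elsewhere.
func reconcileScaleDown(ctx context.Context, b Broker) error {
	if b.Draining {
		if !podRunning(ctx, b) {
			// Pod was killed in an uncontrolled fashion while draining:
			// restore it instead of proceeding with removal.
			return recreatePod(ctx, b)
		}
		if !reassignmentDone(ctx, b) {
			// Replicas are still moving off this broker; requeue.
			return nil
		}
	}
	// Drain finished (or never started): safe to remove the broker.
	return deleteBroker(ctx, b)
}

func main() {
	_ = reconcileScaleDown(context.Background(), Broker{ID: 101, Draining: true})
	fmt.Println("reconcile sketch ran")
}
```

The key point is the ordering: a missing pod mid-drain triggers recreation rather than removal, and deletion only happens once the drain has provably finished.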