Kafka broker not coming back during scale-down
Is your feature request related to a problem? Please describe.
During a scale-down operation, if a broker that is to be removed is deleted in an uncontrolled fashion (while its data is still being drained to the other brokers), the operator doesn't bring the broker back up to finish the drain and then remove it again.
This in turn will slow down new replicas getting in sync, as they will have to pull from two (or, even worse, one) in-sync replicas.
Even more concerning, if there is a K8s cluster rolling restart (due to VM upgrades, for example) and the Kafka pods don't come back up, this will result in offline partitions.
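Both symptoms show up directly in cluster metadata. For reference, here is a minimal sketch using the sarama admin client that flags them; the bootstrap address and Kafka version are assumptions, not values from this issue:

```go
package main

import (
	"fmt"
	"log"

	"github.com/IBM/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Version = sarama.V2_8_0_0 // assumption: match your broker version

	admin, err := sarama.NewClusterAdmin([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	// Gather all topic names, then fetch their partition metadata.
	topics, err := admin.ListTopics()
	if err != nil {
		log.Fatal(err)
	}
	names := make([]string, 0, len(topics))
	for name := range topics {
		names = append(names, name)
	}

	metadata, err := admin.DescribeTopics(names)
	if err != nil {
		log.Fatal(err)
	}
	for _, t := range metadata {
		for _, p := range t.Partitions {
			switch {
			case p.Leader == -1:
				// No live leader at all: the partition is offline.
				fmt.Printf("OFFLINE  %s-%d replicas=%v\n", t.Name, p.ID, p.Replicas)
			case len(p.Isr) < len(p.Replicas):
				// Fewer in-sync replicas than assigned replicas.
				fmt.Printf("UNDER-REPLICATED  %s-%d isr=%v of %v\n", t.Name, p.ID, p.Isr, p.Replicas)
			}
		}
	}
}
```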
Describe the solution you'd like to see
If a broker is killed in an uncontrolled fashion while it is being drained, the broker should be brought back into the cluster and only removed after all of its replicas have been moved off.
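A minimal sketch of what that reconcile behaviour could look like; Broker, podRunning, reassignmentDone, recreatePod and deleteBroker are all hypothetical placeholders, not the operator's actual types or APIs:

```go
package main

import (
	"context"
	"fmt"
)

// Broker is a hypothetical stand-in for the operator's broker model.
type Broker struct {
	ID       int32
	Draining bool // a scale-down drain is in progress
}

// The probes below are placeholders for checks the operator would make
// against the Kubernetes API and the Kafka cluster.
func podRunning(ctx context.Context, b Broker) bool       { return false }
func reassignmentDone(ctx context.Context, b Broker) bool { return false }
func recreatePod(ctx context.Context, b Broker) error     { return nil }
func deleteBroker(ctx context.Context, b Broker) error    { return nil }

// reconcileScaleDown sketches the requested behaviour: a broker that
// dies mid-drain is brought back first, and is only removed once all
// of its replicas have been reassigned elsewhere.
func reconcileScaleDown(ctx context.Context, b Broker) error {
	if b.Draining {
		if !podRunning(ctx, b) {
			// Pod was killed in an uncontrolled fashion while draining:
			// restore it instead of proceeding with removal.
			return recreatePod(ctx, b)
		}
		if !reassignmentDone(ctx, b) {
			// Replicas are still moving off this broker; requeue.
			return nil
		}
	}
	// Drain finished (or never started): safe to remove the broker.
	return deleteBroker(ctx, b)
}

func main() {
	_ = reconcileScaleDown(context.Background(), Broker{ID: 101, Draining: true})
	fmt.Println("reconcile sketch ran")
}
```

The key point is the ordering: a missing pod mid-drain triggers recreation rather than removal, and deletion only happens once the drain has provably finished.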