banzaicloud/koperator

Kafka Cluster Disaster Recovery (Persistent disk with Retain Policy)

tanuj83 opened this issue · 0 comments

Problem Statement

We run a Kafka cluster on persistent disks. If the Kafka cluster gets deleted, we lose all of the cluster's data.

  1. Setting reclaimPolicy: Retain in the storage class does not help here, because the brokers are not managed by a StatefulSet and the pod names are not static. Once the cluster is recreated, the brokers get new pod names and are assigned new PVs, so the old data is not recovered.
  2. The alternative is to use reclaimPolicy: Delete in the storage class together with KafkaBackup (faster recovery when there is a lot of data on disk) and MirrorMaker2 (to recover the remaining data). This is a costly solution that defeats the purpose of saving cost on K8s, and it increases operational work as well.

Proposed Solution

The proposal is to make reclaimPolicy: Retain work for the cluster, either by hardcoding the broker pod name or by using a fixed format such as <clusterName>-<brokerID>. That way the broker pod name does not change when the cluster is recreated, and the same PV can be reused.
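
To illustrate the idea, here is a rough Python sketch (not koperator code). It assumes that broker pods currently get a random suffix appended to their name and that the claim name is derived from the pod name. The helper names (generated_name, stable_name, pvc_name) and the "-storage-" claim naming convention are hypothetical and only serve the example.

```python
import secrets


def generated_name(cluster: str, broker_id: int) -> str:
    """Assumed current behaviour: a random suffix makes the name change on every recreation."""
    return f"{cluster}-{broker_id}-{secrets.token_hex(3)}"


def stable_name(cluster: str, broker_id: int) -> str:
    """Proposed behaviour: <clusterName>-<brokerID>, identical across recreations."""
    return f"{cluster}-{broker_id}"


def pvc_name(pod_name: str, volume_index: int = 0) -> str:
    """Hypothetical convention: the claim name is derived from the pod name."""
    return f"{pod_name}-storage-{volume_index}"


broker_ids = range(3)

# Claims left behind by the first deployment (kept because of reclaimPolicy: Retain).
retained = {pvc_name(stable_name("kafka", i)) for i in broker_ids}

# After recreation, stable names map back to the retained claims ...
reusable_with_stable = [pvc_name(stable_name("kafka", i)) in retained for i in broker_ids]

# ... while names with a random suffix never match, so new volumes would be provisioned.
reusable_with_generated = [pvc_name(generated_name("kafka", i)) in retained for i in broker_ids]

print(reusable_with_stable)
print(reusable_with_generated)
```

With a stable name, a recreated broker can look up its old claim by name. With a generated name there is nothing to match the retained volume against, which is the situation described in point 1 above.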

Alternatives Considered

Additional Context