Koperator crashes when a broker has no storage configurations set in KafkaCluster
panyuenlau opened this issue · 1 comments
panyuenlau commented
Description
Koperator crashes when any broker under the KafkaCluster CR has no storage configurations set.
Expected Behavior
Koperator should handle the case where a broker in the KafkaCluster has no storage configurations set, instead of crashing.
Actual Behavior
Koperator crashes because of a nil pointer dereference:
{"level":"info","ts":"2023-07-21T14:28:21.735Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"KafkaCluster","controllerGroup":"kafka.banzaicloud.io","controllerKind":"KafkaCluster","KafkaCluster":{"name":"test","namespace":"default"},"namespace":"default","name":"test","reconcileID":"c2ced75e-acaf-404d-aa53-ecab89a4f1d5"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x1804819]
goroutine 504 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:119 +0x1fa
panic({0x1d09840, 0x3464fd0})
/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/banzaicloud/koperator/pkg/resources/kafka.(*Reconciler).Reconcile(0xc00056b340, {{0x2375930?, 0xc0013f3cb0?}, 0x1c54020?})
/workspace/pkg/resources/kafka/kafka.go:251 +0x2419
github.com/banzaicloud/koperator/controllers.(*KafkaClusterReconciler).Reconcile(0xc0003f20a0, {0x2370650, 0xc0013f3d40}, {{{0xc0007ecb20?, 0x10?}, {0xc0007ecb1c?, 0x40da67?}}})
/workspace/controllers/kafkacluster_controller.go:126 +0x8e3
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x2370650?, {0x2370650?, 0xc0013f3d40?}, {{{0xc0007ecb20?, 0x1c58e60?}, {0xc0007ecb1c?, 0x0?}}})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0004521e0, {0x23705a8, 0xc0007341c0}, {0x1daf980?, 0xc000e26360?})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323 +0x3a5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0004521e0, {0x23705a8, 0xc0007341c0})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:231 +0x333
Affected Version
<= v0.25.1
Steps to Reproduce
- Create a KafkaCluster that provides no storage configurations for any of its brokers:
  apiVersion: kafka.banzaicloud.io/v1beta1
  kind: KafkaCluster
  metadata:
    name: test
  spec:
    ...
    brokers:
      - id: 0
      - id: 1
    ...
- Observe Koperator behavior.
Checklist
- I have read the contributing guidelines
- I have verified this does not duplicate an existing issue
panyuenlau commented
Root cause
Koperator expects each broker to have storage configurations set via either brokers[x].storageConfigs or brokers[x].brokerConfigGroup. It wrongly assumes that users always set one of the two, so a broker with neither leads to the nil pointer dereference.
Potential Solutions
- When neither configuration is provided, Koperator assigns a default storage configuration (with a PVC) to the broker, e.g.:
  - mountPath: "/kafka-logs"
    pvcSpec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
Note: this might require us to start implementing a mutating webhook in Koperator.
- Handle all the potential nil pointer accesses across the current implementation, and start the broker with no storage configuration. By default Kafka uses /tmp/kafka-logs as the log directory, and K8s uses local ephemeral storage for the pod.
Note: ephemeral storage is tied to the lifecycle of a pod. When a pod finishes or is restarted, the storage is cleared out.
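A rough sketch of the first option (defaulting), using the values from the example above. The types and function names here are hypothetical and deliberately simplified. A real implementation would operate on Koperator's API types and the Kubernetes PVC spec, and could live in a webhook or in the reconciler.

```go
package main

import "fmt"

// PVCSpec and StorageConfig are simplified, hypothetical stand-ins for the
// real API types; only the fields used by the default are modelled.
type PVCSpec struct {
	AccessModes    []string
	StorageRequest string
}

type StorageConfig struct {
	MountPath string
	PVCSpec   *PVCSpec
}

// defaultStorageConfigs mirrors the default proposed in this comment:
// one PVC-backed volume of 10Gi mounted at /kafka-logs.
func defaultStorageConfigs() []StorageConfig {
	return []StorageConfig{{
		MountPath: "/kafka-logs",
		PVCSpec: &PVCSpec{
			AccessModes:    []string{"ReadWriteOnce"},
			StorageRequest: "10Gi",
		},
	}}
}

// withDefaultStorage leaves user-supplied configs untouched and only falls
// back to the default when nothing was provided.
func withDefaultStorage(configs []StorageConfig) []StorageConfig {
	if len(configs) > 0 {
		return configs
	}
	return defaultStorageConfigs()
}

func main() {
	effective := withDefaultStorage(nil) // broker with no storage configuration
	fmt.Println(effective[0].MountPath, effective[0].PVCSpec.StorageRequest)
}
```

The second option would skip the defaulting step and instead make every consumer of the storage configs tolerate an empty list.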