strimzi/strimzi-canary

Canary broken pipe during a describe cluster after rolling Kafka brokers

ppatierno opened this issue · 0 comments

When the canary is running and a rolling restart of the Kafka brokers happens in the cluster, the next canary reconcile fails while describing the cluster through the admin client, with the following errors:

canary_manager.go:119] Canary manager reconcile ...
topic.go:117] Error describing cluster: EOF
canary_manager.go:129] ... reconcile done
canary_manager.go:119] Canary manager reconcile ...
topic.go:104] Error describing cluster: write tcp 10.130.44.71:58660->10.131.41.90:9093: write: broken pipe
canary_manager.go:129] ... reconcile done

It seems to be a Sarama issue [1] which, even though it is closed, does not appear to be fixed.
Other projects like KEDA [2] and Knative [3] had the same kind of problem and worked around it by closing and re-creating the admin client when the error happens, in order to re-establish connections cleanly.
The same kind of logic should be applied in the internal TopicService.
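
A minimal sketch of that close/re-create logic, not the actual canary code. To stay self-contained it does not import Sarama: `clusterAdmin`, `adminFactory`, `topicService` and `fakeAdmin` are hypothetical names made up for illustration. In the real TopicService the factory would presumably wrap `sarama.NewClusterAdmin`, and the describe call would be `DescribeCluster` on `sarama.ClusterAdmin`, which also returns the brokers and the controller ID.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterAdmin is a hypothetical, minimal subset of an admin client.
// With Sarama, sarama.ClusterAdmin offers DescribeCluster and Close
// (the real DescribeCluster also returns brokers and the controller ID).
type clusterAdmin interface {
	DescribeCluster() error
	Close() error
}

// adminFactory creates a fresh admin client. With Sarama it could wrap
// sarama.NewClusterAdmin(brokers, config); that wiring is assumed here.
type adminFactory func() (clusterAdmin, error)

// topicService is a hypothetical stand-in for the canary's TopicService.
type topicService struct {
	newAdmin adminFactory
	admin    clusterAdmin // nil until first use, or after an error
}

// describeCluster lazily creates the admin client and, on any error,
// closes and drops it so the next reconcile opens new connections.
func (ts *topicService) describeCluster() error {
	if ts.admin == nil {
		a, err := ts.newAdmin()
		if err != nil {
			return fmt.Errorf("creating admin client: %w", err)
		}
		ts.admin = a
	}
	if err := ts.admin.DescribeCluster(); err != nil {
		// The Close error is ignored: the connection is already broken.
		_ = ts.admin.Close()
		ts.admin = nil
		return fmt.Errorf("describing cluster: %w", err)
	}
	return nil
}

// fakeAdmin simulates a client whose connection may be stale.
type fakeAdmin struct {
	stale  bool
	closed bool
}

func (f *fakeAdmin) DescribeCluster() error {
	if f.stale {
		return errors.New("write: broken pipe")
	}
	return nil
}

func (f *fakeAdmin) Close() error {
	f.closed = true
	return nil
}

func main() {
	created := 0
	ts := &topicService{newAdmin: func() (clusterAdmin, error) {
		created++
		// Only the first client is stale, as after a rolling restart.
		return &fakeAdmin{stale: created == 1}, nil
	}}
	fmt.Println(ts.describeCluster()) // first reconcile fails, client is dropped
	fmt.Println(ts.describeCluster()) // second reconcile uses a fresh client
	fmt.Println("clients created:", created)
}
```

With this approach one reconcile still fails after the brokers roll, but the following one starts from a clean client instead of reusing the stale connection. Retrying immediately within the same reconcile would be an alternative design, at the cost of a longer reconcile.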

[1] IBM/sarama#1162
[2] kedacore/keda#1463
[3] knative/eventing#427