Client can't reach the brokers
OuesFa opened this issue · 2 comments
Hello 👋
I'm currently using Strimzi Canary 0.6.0 against a Kafka cluster managed by Strimzi (version 0.32.0-kafka-3.2.0).
Although my Java clients are working properly and I see no observable problems on the cluster, I've noticed that the Strimzi Canary pod is restarting frequently.
NAME READY STATUS RESTARTS AGE
platform-kafka-prober-b987bc54d-r9ldc 1/1 Running 854 (25m ago) 15d
With these errors
k logs -p platform-kafka-prober-b987bc54d-r9ldc
W0317 09:13:12.842726 1 main.go:215] Applied dynamic config {SaramaLogEnabled:false, VerbosityLogLevel:0}
I0317 09:13:12.842776 1 main.go:99] Starting Strimzi canary tool [0.6.0] with config: {BootstrapServers:[platform-kafka-kafka-bootstrap:9093], BootstrapBackoffMaxAttempts:10, BootstrapBackoffScale:5000, Topic:__strimzi_canary, TopicConfig:map[], ReconcileInterval:10000 ms, ClientID:strimzi-canary-client, ConsumerGroupID:strimzi-canary-group, ProducerLatencyBuckets:[2 5 10 20 50 100 200 400], EndToEndLatencyBuckets:[5 10 20 50 100 200 400 800], ExpectedClusterSize:-1, KafkaVersion:3.2.0,TLSEnabled:false, TLSCACert:, TLSClientCert:, TLSClientKey:, TLSInsecureSkipVerify:false,SASLMechanism:, SASLUser:, SASLPassword:, ConnectionCheckInterval:120000 ms, ConnectionCheckLatencyBuckets:[100 200 400 800 1600], StatusCheckInterval:30000 ms, StatusTimeWindow:300000 ms,DynamicConfigFile: , DynamicCanaryConfig: {SaramaLogEnabled:false, VerbosityLogLevel:0}, DynamicConfigWatcherInterval: 30000 ms}
I0317 09:13:12.842858 1 http_server.go:41] Starting HTTP server
W0317 09:13:13.602595 1 main.go:198] Error creating new Sarama client, retrying in 5000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:13:19.362688 1 main.go:198] Error creating new Sarama client, retrying in 10000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:13:30.123813 1 main.go:198] Error creating new Sarama client, retrying in 20000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:13:50.885749 1 main.go:198] Error creating new Sarama client, retrying in 40000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:14:31.649906 1 main.go:198] Error creating new Sarama client, retrying in 80000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:15:52.413345 1 main.go:198] Error creating new Sarama client, retrying in 160000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:18:33.174506 1 main.go:198] Error creating new Sarama client, retrying in 300000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:23:33.936327 1 main.go:198] Error creating new Sarama client, retrying in 300000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:28:34.698236 1 main.go:198] Error creating new Sarama client, retrying in 300000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
W0317 09:33:35.460100 1 main.go:198] Error creating new Sarama client, retrying in 300000 ms: kafka: client has run out of available brokers to talk to: unexpected EOF
E0317 09:38:36.222266 1 main.go:194] Error connecting to the Kafka cluster after 10 retries: Maximum number of attempts exceeded
F0317 09:38:36.222296 1 main.go:128] Error creating producer Sarama client: Maximum number of attempts exceeded
As I'm using the canary as a prober, I need it to run reliably. Can you suggest how to investigate what's causing this?
Thank you
Can you share the Kafka custom resource you are using to deploy the cluster?
I notice you are using a bootstrap server on port 9093 .... does that listener have TLS? I don't see TLS enabled on the canary, or the canary providing the certificates and so on.
Also, how are you deploying the canary? It's not configured by the Strimzi operator, so I guess you have your own Deployment for it? Can you share it, please?
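For context, on a Strimzi-managed cluster port 9093 is conventionally the TLS listener, so a canary connecting in plaintext would see exactly this kind of `unexpected EOF`. A minimal sketch of the TLS-related parts of a canary Deployment is below. The env var names (`TLS_ENABLED`, `TLS_CA_CERT`) come from the strimzi-canary configuration; the Secret name follows Strimzi's `<cluster>-cluster-ca-cert` convention, and whether `TLS_CA_CERT` expects the PEM content (as assumed here) or a file path should be double-checked against the README for your canary version:

```yaml
# Excerpt of a canary Deployment spec - only the TLS-related settings.
containers:
  - name: strimzi-canary
    image: quay.io/strimzi/canary:0.6.0
    env:
      - name: KAFKA_BOOTSTRAP_SERVERS
        value: platform-kafka-kafka-bootstrap:9093   # TLS listener
      - name: TLS_ENABLED
        value: "true"
      - name: TLS_CA_CERT
        # Assumption: the canary accepts the CA as PEM content in the env var.
        valueFrom:
          secretKeyRef:
            name: platform-kafka-cluster-ca-cert   # Strimzi <cluster>-cluster-ca-cert Secret
            key: ca.crt
```

Alternatively, pointing the canary at a plain listener (typically port 9092, if one is exposed) avoids TLS configuration entirely.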
Hello 👋
Sorry for the late feedback.
This was indeed due to a misconfigured TLS connection on the deployment side.
Thank you so much for the lead. Closing this.