Implement backoff strategy for Kafka connections in Kafka Lag Exporter
seglo opened this issue · 0 comments
When installed in a fresh cluster, Kafka Lag Exporter fails if a configured or discovered Kafka cluster cannot be reached. Kafka Lag Exporter is configured to automatically discover Strimzi Kafka clusters by watching for Kafka CRDs, but on first install it may detect a Kafka CRD before Kafka has finished coming online, and then fail. It won’t attempt to connect again.
The current workaround is to delete the pod and let its deployment recreate it once the Kafka clusters are online. Since Kafka Lag Exporter can support multiple clusters, I would like to add a backoff strategy to connection attempts so that it keeps trying to connect to each cluster indefinitely.
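To make the proposal concrete, here is a minimal sketch of capped exponential backoff around a connection attempt. It is not the exporter's implementation: `BackoffRetry`, `Sleeper`, `delayFor`, and `retryForever` are hypothetical names invented for illustration, and the 1 second / 8 second bounds in the usage note are arbitrary. In an Akka Typed application the same idea would more likely be expressed through actor supervision (for example `SupervisorStrategy.restartWithBackoff`) than through a blocking loop like this one.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an action indefinitely with capped exponential backoff. */
final class BackoffRetry {

    /** Abstraction over sleeping so the waiting behaviour can be swapped out (e.g. in tests). */
    interface Sleeper {
        void sleep(Duration delay) throws InterruptedException;
    }

    private final Duration minBackoff;
    private final Duration maxBackoff;
    private final Sleeper sleeper;

    BackoffRetry(Duration minBackoff, Duration maxBackoff, Sleeper sleeper) {
        if (minBackoff.isZero() || minBackoff.isNegative() || maxBackoff.compareTo(minBackoff) < 0) {
            throw new IllegalArgumentException("require 0 < minBackoff <= maxBackoff");
        }
        this.minBackoff = minBackoff;
        this.maxBackoff = maxBackoff;
        this.sleeper = sleeper;
    }

    /** Delay after failed attempt number `attempt` (0-based): minBackoff * 2^attempt, capped at maxBackoff. */
    Duration delayFor(int attempt) {
        long min = minBackoff.toMillis();
        long max = maxBackoff.toMillis();
        int exp = Math.min(attempt, 30);
        // Compare before shifting so the multiplication can never overflow.
        return (min > (max >> exp)) ? maxBackoff : Duration.ofMillis(min << exp);
    }

    /** Calls `action` until it succeeds, sleeping between failed attempts. */
    <T> T retryForever(Callable<T> action) throws InterruptedException {
        int attempt = 0;
        while (true) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                throw e; // honour interruption instead of retrying
            } catch (Exception e) {
                // Assumption: every other failure (such as an unreachable cluster) is worth retrying.
                sleeper.sleep(delayFor(attempt));
                attempt = Math.min(attempt + 1, 30); // keep the counter bounded
            }
        }
    }
}
```

A caller could construct it as `new BackoffRetry(Duration.ofSeconds(1), Duration.ofSeconds(8), d -> Thread.sleep(d.toMillis()))` and wrap the client creation in `retryForever`. One retry loop per cluster would keep a slow cluster from blocking the others.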
Logs for kafka-lag-exporter pod:
2019-01-17 13:26:33,410 WARN org.apache.kafka.clients.ClientUtils - Couldn't resolve server pipelines-strimzi-kafka-bootstrap.lightbend:9092 from bootstrap.servers as DNS resolution failed for pipelines-strimzi-kafka-bootstrap.lightbend
2019-01-17 13:26:33,423 ERROR akka.actor.OneForOneStrategy akka://kafkalagexporterapp/user/consumer-group-collector-pipelines-strimzi - Failed create new KafkaAdminClient
akka.actor.ActorInitializationException: akka://kafkalagexporterapp/user/consumer-group-collector-pipelines-strimzi: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:193)
at akka.actor.ActorCell.create(ActorCell.scala:669)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:523)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:545)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:283)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.kafka.common.KafkaException: Failed create new KafkaAdminClient
at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:378)
at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:54)
at com.lightbend.kafkalagexporter.KafkaClient$.com$lightbend$kafkalagexporter$KafkaClient$$createAdminClient(KafkaClient.scala:42)
at com.lightbend.kafkalagexporter.KafkaClient.<init>(KafkaClient.scala:71)
at com.lightbend.kafkalagexporter.KafkaClient$.apply(KafkaClient.scala:18)
at com.lightbend.kafkalagexporter.MainApp$.$anonfun$clientCreator$1(MainApp.scala:26)
at com.lightbend.kafkalagexporter.ConsumerGroupCollector$.$anonfun$init$1(ConsumerGroupCollector.scala:47)
at akka.actor.typed.Behavior$DeferredBehavior$$anon$1.apply(Behavior.scala:219)
at akka.actor.typed.Behavior$.start(Behavior.scala:300)
at akka.actor.typed.internal.adapter.ActorAdapter.start(ActorAdapter.scala:145)
at akka.actor.typed.internal.adapter.ActorAdapter.preStart(ActorAdapter.scala:140)
at akka.actor.Actor.aroundPreStart(Actor.scala:528)
at akka.actor.Actor.aroundPreStart$(Actor.scala:528)
at akka.actor.typed.internal.adapter.ActorAdapter.aroundPreStart(ActorAdapter.scala:21)
at akka.actor.ActorCell.create(ActorCell.scala:652)
... 9 common frames omitted
Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:86)
at org.apache.kafka.clients.admin.KafkaAdminClient.<init>(KafkaAdminClient.java:417)
at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:371)
... 23 common frames omitted