Yolean/kubernetes-kafka

Unable to successfully start pods - CrashLoopBackOff error

wesleyscholl opened this issue · 1 comments

Describe the bug
When following the installation instructions, I am unable to start the Kafka or ZooKeeper pods. They stay stuck in the CrashLoopBackOff, Init:CrashLoopBackOff, or Error status.

To Reproduce
Steps to reproduce the behavior:

  1. Cloned the kubernetes-kafka repo

  2. Ran the following commands:

    • kubectl apply -f 00-namespace.yml
    • kubectl apply -k rbac-namespace-default/
    • kubectl apply -k zookeeper/
    • kubectl apply -k kafka/
  3. Checked pods using kubectl get pods

Expected behavior
Pods should start and be reachable so that Kafka messages can be produced and consumed.

Attempted Troubleshooting

  • Restarted pods:
    - kubectl get pod <pod_name> -n <namespace> -o yaml | kubectl replace --force -f -

  • Deleted namespace and reinstalled
    - kubectl delete namespace <namespace>

  • Reset Kubernetes from Rancher Desktop

  • Attempted using variants
    - kubectl apply -k github.com/Yolean/kubernetes-kafka/variants/dev-small
    - kubectl apply -k variants/scale-1/

Screenshots, Configurations, and Logs

kubectl get pods - output
NAMESPACE     NAME                                        READY   STATUS                  RESTARTS        AGE
argo-events   kafka-0                                     0/1     Init:CrashLoopBackOff   8 (4m51s ago)   20m
argo-events   zoo-0                                       0/1     CrashLoopBackOff        8 (4m26s ago)   20m
argo-events   kafka-2                                     0/1     Init:CrashLoopBackOff   1 (15s ago)     18s
argo-events   kafka-1                                     0/1     Init:CrashLoopBackOff   1 (15s ago)     18s
argo-events   pzoo-0                                      0/1     CrashLoopBackOff        2 (11s ago)     32s
argo-events   zoo-1                                       0/1     Error                   2 (25s ago)     32s
argo-events   pzoo-2                                      0/1     CrashLoopBackOff        2 (9s ago)      31s
argo-events   pzoo-1                                      0/1     CrashLoopBackOff        2 (6s ago)      32s
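
To pick the unhealthy pods out of a listing like this one, a small filter can help. This is only a sketch: it assumes the listing comes from kubectl get pods -A (so the NAMESPACE column is present and STATUS is the fourth field), and failing_pods is a hypothetical helper name, not part of kubectl or this repo.

```shell
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# "<namespace>/<name> <status>" for every pod that is not Running or Completed.
failing_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage, assuming kubectl is on PATH and pointed at the cluster:
#   kubectl get pods -A | failing_pods
```

Without -A the NAMESPACE column is absent and the field numbers shift by one, so the awk fields would need adjusting.
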
Logs from kubectl logs kafka-0
Defaulted container "broker" out of: broker, init-config (init)
Error from server (BadRequest): container "broker" in pod "kafka-0" is waiting to start: PodInitializing
Logs from kubectl logs zoo-0
Defaulted container "zookeeper" out of: zookeeper, init-config (init)
[main] INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig - Reading configuration from: /etc/kafka/zookeeper.properties.scale-1.zoo-0
[main] INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig - clientPortAddress is 0.0.0.0:2181
[main] INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig - secureClientPort is not set
[main] ERROR org.apache.zookeeper.server.quorum.QuorumPeerConfig - Invalid configuration, only one server specified (ignoring)
[main] INFO org.apache.zookeeper.server.DatadirCleanupManager - autopurge.snapRetainCount set to 3
[main] INFO org.apache.zookeeper.server.DatadirCleanupManager - autopurge.purgeInterval set to 0
[main] INFO org.apache.zookeeper.server.DatadirCleanupManager - Purge task is not scheduled.
[main] WARN org.apache.zookeeper.server.quorum.QuorumPeerMain - Either no config or no quorum defined in config, running  in standalone mode
[main] INFO org.apache.zookeeper.jmx.ManagedUtil - Log4j 1.2 jmx support not found; jmx disabled.
[main] INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig - Reading configuration from: /etc/kafka/zookeeper.properties.scale-1.zoo-0
[main] INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig - clientPortAddress is 0.0.0.0:2181
[main] INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig - secureClientPort is not set
[main] ERROR org.apache.zookeeper.server.quorum.QuorumPeerConfig - Invalid configuration, only one server specified (ignoring)
[main] INFO org.apache.zookeeper.server.ZooKeeperServerMain - Starting server
[main] INFO org.apache.zookeeper.server.persistence.FileTxnSnapLog - zookeeper.snapshot.trust.empty : false
[main] ERROR org.apache.zookeeper.server.ZooKeeperServerMain - Unable to access datadir, exiting abnormally
org.apache.zookeeper.server.persistence.FileTxnSnapLog$DatadirException: Unable to create snap directory /var/lib/zookeeper/data/version-2
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:147)
	at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:124)
	at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
	at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
Unable to access datadir, exiting abnormally
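
The lines that seem to matter in the zoo-0 log are the final ERROR and the DatadirException, which suggest the process cannot create directories under /var/lib/zookeeper/data (possibly a volume permission problem, though that is a guess). A minimal sketch for pulling those lines out of a long log follows; zk_fatal_lines is a hypothetical helper name, and the follow-up commands in the comments assume the argo-events namespace shown in the pod listing.

```shell
# Hypothetical helper: read a pod log on stdin and keep only ERROR lines and
# exception messages, skipping the "(ignoring)" notice and the stack frames.
zk_fatal_lines() {
  grep -E ' ERROR |Exception:' | grep -v '(ignoring)'
}

# Usage, assuming kubectl is on PATH and the pods are in argo-events:
#   kubectl logs zoo-0 -n argo-events | zk_fatal_lines
# Possible follow-ups to inspect the init container and the data volume:
#   kubectl logs zoo-0 -n argo-events -c init-config
#   kubectl describe pod zoo-0 -n argo-events
```
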

Environment:

  • Kubernetes: v1.25.9
  • Argo: v3.4.7
  • Argo Events: v1.7.6
  • Rancher Desktop Version 1.9.0
  • macOS: 13.4 (22F66)

Additional context

I'm attempting to connect Kafka to a local Argo Workflows instance so that workflows are kicked off when messages are produced. Please advise, thanks.

Any updates? I am still getting these error messages.