Yolean/kubernetes-kafka

Broker pods CrashLoopBackOff on minikube

guilhemmarchand opened this issue · 10 comments

Hi there!

I am trying to get this working for testing purposes, but I am having trouble running it on minikube: the broker pods are unable to start properly.

 kubectl -n kafka get po
NAME      READY   STATUS                  RESTARTS   AGE
kafka-0   0/1     Init:Error              2          21s
kafka-1   0/1     Init:CrashLoopBackOff   1          21s
kafka-2   0/1     Init:CrashLoopBackOff   1          21s
pzoo-0    1/1     Running                 0          11m
pzoo-1    1/1     Running                 0          11m
pzoo-2    1/1     Running                 0          11m
zoo-0     1/1     Running                 0          11m
zoo-1     1/1     Running                 0          11m

Describe on one of the pods:

kubectl -n kafka describe po kafka-0
Name:           kafka-0
Namespace:      kafka
Node:           minikube/10.0.2.15
Start Time:     Sun, 09 Dec 2018 20:12:04 +0000
Labels:         app=kafka
                controller-revision-hash=kafka-56fff8c47f
                statefulset.kubernetes.io/pod-name=kafka-0
Annotations:    <none>
Status:         Pending
IP:             172.17.0.10
Controlled By:  StatefulSet/kafka
Init Containers:
  init-config:
    Container ID:  docker://08aef4a6e8b2b1d112dcee8021ef64c126163567b711d524e48993dbae74083a
    Image:         solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea
    Image ID:      docker-pullable://solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      /etc/kafka-configmap/init.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 09 Dec 2018 20:15:17 +0000
      Finished:     Sun, 09 Dec 2018 20:15:17 +0000
    Ready:          False
    Restart Count:  5
    Environment:
      NODE_NAME:       (v1:spec.nodeName)
      POD_NAME:       kafka-0 (v1:metadata.name)
      POD_NAMESPACE:  kafka (v1:metadata.namespace)
    Mounts:
      /etc/kafka from config (rw)
      /etc/kafka-configmap from configmap (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-x5tbp (ro)
Containers:
  broker:
    Container ID:  
    Image:         solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1
    Image ID:      
    Ports:         9092/TCP, 9094/TCP, 5555/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      ./bin/kafka-server-start.sh
      /etc/kafka/server.properties
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  600Mi
    Requests:
      cpu:      100m
      memory:   100Mi
    Readiness:  tcp-socket :9092 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      KAFKA_LOG4J_OPTS:  -Dlog4j.configuration=file:/etc/kafka/log4j.properties
      JMX_PORT:          5555
    Mounts:
      /etc/kafka from config (rw)
      /var/lib/kafka/data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-x5tbp (ro)
Conditions:
  Type           Status
  Initialized    False 
  Ready          False 
  PodScheduled   True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-kafka-0
    ReadOnly:   false
  configmap:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      broker-config
    Optional:  false
  config:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  default-token-x5tbp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-x5tbp
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                    From               Message
  ----     ------                 ----                   ----               -------
  Warning  FailedScheduling       3m42s (x2 over 3m42s)  default-scheduler  pod has unbound PersistentVolumeClaims
  Normal   Scheduled              3m41s                  default-scheduler  Successfully assigned kafka-0 to minikube
  Normal   SuccessfulMountVolume  3m40s                  kubelet, minikube  MountVolume.SetUp succeeded for volume "config"
  Normal   SuccessfulMountVolume  3m40s                  kubelet, minikube  MountVolume.SetUp succeeded for volume "pvc-b1cd72fa-fbee-11e8-b5ac-080027c034e1"
  Normal   SuccessfulMountVolume  3m40s                  kubelet, minikube  MountVolume.SetUp succeeded for volume "configmap"
  Normal   SuccessfulMountVolume  3m40s                  kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-x5tbp"
  Normal   Created                2m52s (x4 over 3m40s)  kubelet, minikube  Created container
  Normal   Started                2m52s (x4 over 3m40s)  kubelet, minikube  Started container
  Warning  BackOff                2m13s (x8 over 3m38s)  kubelet, minikube  Back-off restarting failed container
  Normal   Pulled                 2m2s (x5 over 3m40s)   kubelet, minikube  Container image "solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea" already present on machine

I initially thought the issue was related to unbound volumes:

kubectl -n kafka get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS               REASON   AGE
pvc-22e23092-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            Delete           Bound    kafka/data-pzoo-0    kafka-zookeeper                     15m
pvc-22e3a6a3-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            Delete           Bound    kafka/data-pzoo-1    kafka-zookeeper                     15m
pvc-22e914af-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            Delete           Bound    kafka/data-pzoo-2    kafka-zookeeper                     15m
pvc-25a026b8-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            Delete           Bound    kafka/data-zoo-0     kafka-zookeeper-regional            15m
pvc-25a955ff-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            Delete           Bound    kafka/data-zoo-1     kafka-zookeeper-regional            15m
pvc-b1cd72fa-fbee-11e8-b5ac-080027c034e1   10Gi       RWO            Delete           Bound    kafka/data-kafka-0   kafka-broker                        4m
pvc-b1d0e07e-fbee-11e8-b5ac-080027c034e1   10Gi       RWO            Delete           Bound    kafka/data-kafka-1   kafka-broker                        4m
pvc-b1d4228e-fbee-11e8-b5ac-080027c034e1   10Gi       RWO            Delete           Bound    kafka/data-kafka-2   kafka-broker                        4m
kubectl -n kafka get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
data-kafka-0   Bound    pvc-b1cd72fa-fbee-11e8-b5ac-080027c034e1   10Gi       RWO            kafka-broker               4m
data-kafka-1   Bound    pvc-b1d0e07e-fbee-11e8-b5ac-080027c034e1   10Gi       RWO            kafka-broker               4m
data-kafka-2   Bound    pvc-b1d4228e-fbee-11e8-b5ac-080027c034e1   10Gi       RWO            kafka-broker               4m
data-pzoo-0    Bound    pvc-22e23092-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            kafka-zookeeper            15m
data-pzoo-1    Bound    pvc-22e3a6a3-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            kafka-zookeeper            15m
data-pzoo-2    Bound    pvc-22e914af-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            kafka-zookeeper            15m
data-zoo-0     Bound    pvc-25a026b8-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            kafka-zookeeper-regional   15m
data-zoo-1     Bound    pvc-25a955ff-fbed-11e8-b5ac-080027c034e1   1Gi        RWO            kafka-zookeeper-regional   15m

Looking at data-kafka-0:

kubectl -n kafka describe pvc data-kafka-0
Name:          data-kafka-0
Namespace:     kafka
StorageClass:  kafka-broker
Status:        Bound
Volume:        pvc-b1cd72fa-fbee-11e8-b5ac-080027c034e1
Labels:        app=kafka
Annotations:   control-plane.alpha.kubernetes.io/leader:
                 {"holderIdentity":"e3c00091-fbeb-11e8-85fc-080027c034e1","leaseDurationSeconds":15,"acquireTime":"2018-12-09T20:12:03Z","renewTime":"2018-...
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWO
Events:
  Type       Reason                 Age    From                                                           Message
  ----       ------                 ----   ----                                                           -------
  Normal     Provisioning           9m14s  k8s.io/minikube-hostpath e3c00091-fbeb-11e8-85fc-080027c034e1  External provisioner is provisioning volume for claim "kafka/data-kafka-0"
  Normal     ProvisioningSucceeded  9m14s  k8s.io/minikube-hostpath e3c00091-fbeb-11e8-85fc-080027c034e1  Successfully provisioned volume pvc-b1cd72fa-fbee-11e8-b5ac-080027c034e1
  Normal     ExternalProvisioning   9m14s  persistentvolume-controller                                    waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator
Mounted By:  kafka-0

Or data-kafka-1:

kubectl -n kafka describe pvc data-kafka-1
Name:          data-kafka-1
Namespace:     kafka
StorageClass:  kafka-broker
Status:        Bound
Volume:        pvc-b1d0e07e-fbee-11e8-b5ac-080027c034e1
Labels:        app=kafka
Annotations:   control-plane.alpha.kubernetes.io/leader:
                 {"holderIdentity":"e3c00091-fbeb-11e8-85fc-080027c034e1","leaseDurationSeconds":15,"acquireTime":"2018-12-09T20:12:03Z","renewTime":"2018-...
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWO
Events:
  Type       Reason                 Age                    From                                                           Message
  ----       ------                 ----                   ----                                                           -------
  Normal     ExternalProvisioning   9m51s (x2 over 9m51s)  persistentvolume-controller                                    waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator
  Normal     Provisioning           9m51s                  k8s.io/minikube-hostpath e3c00091-fbeb-11e8-85fc-080027c034e1  External provisioner is provisioning volume for claim "kafka/data-kafka-1"
  Normal     ProvisioningSucceeded  9m50s                  k8s.io/minikube-hostpath e3c00091-fbeb-11e8-85fc-080027c034e1  Successfully provisioned volume pvc-b1d0e07e-fbee-11e8-b5ac-080027c034e1
Mounted By:  kafka-1

I am not sure I understand what I am missing here; the Zookeeper pods start correctly.

Minikube is the latest version, v0.30.0, and Kubernetes is v1.10.0.

What I did was:

  • start a fresh minikube
  • apply configure/minikube-storageclasses.yml
  • apply the namespace
  • apply everything in Zookeeper
  • apply everything in Kafka

The logs show no information other than the pod waiting to initialize.

Am I missing something or misunderstanding the instructions?

Thank you very much

Guilhem

Re-reading the above, the scheduler event clearly says the pod has unbound PersistentVolumeClaims, but I cannot figure out why, as the PVC is reported as successfully provisioned and bound.

 kubectl -n kafka describe po kafka-0
Name:           kafka-0
Namespace:      kafka
Node:           minikube/10.0.2.15
Start Time:     Sun, 09 Dec 2018 22:26:39 +0000
Labels:         app=kafka
                controller-revision-hash=kafka-56fff8c47f
                statefulset.kubernetes.io/pod-name=kafka-0
Annotations:    <none>
Status:         Pending
IP:             172.17.0.10
Controlled By:  StatefulSet/kafka
Init Containers:
  init-config:
    Container ID:  docker://86c95130bcc4a09e89a7b168374e964a894e7f69821899c5fd2579ee8ab45b83
    Image:         solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea
    Image ID:      docker-pullable://solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      /etc/kafka-configmap/init.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 09 Dec 2018 22:26:43 +0000
      Finished:     Sun, 09 Dec 2018 22:26:43 +0000
    Ready:          False
    Restart Count:  1
    Environment:
      NODE_NAME:       (v1:spec.nodeName)
      POD_NAME:       kafka-0 (v1:metadata.name)
      POD_NAMESPACE:  kafka (v1:metadata.namespace)
    Mounts:
      /etc/kafka from config (rw)
      /etc/kafka-configmap from configmap (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vkzq2 (ro)
Containers:
  broker:
    Container ID:  
    Image:         solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1
    Image ID:      
    Ports:         9092/TCP, 9094/TCP, 5555/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      ./bin/kafka-server-start.sh
      /etc/kafka/server.properties
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  600Mi
    Requests:
      cpu:      100m
      memory:   100Mi
    Readiness:  tcp-socket :9092 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      KAFKA_LOG4J_OPTS:  -Dlog4j.configuration=file:/etc/kafka/log4j.properties
      JMX_PORT:          5555
    Mounts:
      /etc/kafka from config (rw)
      /var/lib/kafka/data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vkzq2 (ro)
Conditions:
  Type           Status
  Initialized    False 
  Ready          False 
  PodScheduled   True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-kafka-0
    ReadOnly:   false
  configmap:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      broker-config
    Optional:  false
  config:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  default-token-vkzq2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vkzq2
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                From               Message
  ----     ------                 ----               ----               -------
  Warning  FailedScheduling       21s (x2 over 21s)  default-scheduler  pod has unbound PersistentVolumeClaims
  Normal   Scheduled              20s                default-scheduler  Successfully assigned kafka-0 to minikube
  Normal   SuccessfulMountVolume  19s                kubelet, minikube  MountVolume.SetUp succeeded for volume "pvc-7ecbb17d-fc01-11e8-b20e-08002710da99"
  Normal   SuccessfulMountVolume  19s                kubelet, minikube  MountVolume.SetUp succeeded for volume "config"
  Normal   SuccessfulMountVolume  19s                kubelet, minikube  MountVolume.SetUp succeeded for volume "configmap"
  Normal   SuccessfulMountVolume  19s                kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-vkzq2"
  Normal   Pulled                 17s (x2 over 19s)  kubelet, minikube  Container image "solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea" already present on machine
  Normal   Created                17s (x2 over 18s)  kubelet, minikube  Created container
  Normal   Started                16s (x2 over 18s)  kubelet, minikube  Started container
  Warning  BackOff                15s (x2 over 16s)  kubelet, minikube  Back-off restarting failed container
kubectl -n kafka describe pvc data-kafka-0
Name:          data-kafka-0
Namespace:     kafka
StorageClass:  kafka-broker
Status:        Bound
Volume:        pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
Labels:        app=kafka
Annotations:   control-plane.alpha.kubernetes.io/leader:
                 {"holderIdentity":"f4d65cbf-fc00-11e8-b872-08002710da99","leaseDurationSeconds":15,"acquireTime":"2018-12-09T22:26:38Z","renewTime":"2018-...
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
Events:
  Type       Reason                 Age                  From                                                           Message
  ----       ------                 ----                 ----                                                           -------
  Normal     ExternalProvisioning   109s (x3 over 109s)  persistentvolume-controller                                    waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator
  Normal     Provisioning           109s                 k8s.io/minikube-hostpath f4d65cbf-fc00-11e8-b872-08002710da99  External provisioner is provisioning volume for claim "kafka/data-kafka-0"
  Normal     ProvisioningSucceeded  108s                 k8s.io/minikube-hostpath f4d65cbf-fc00-11e8-b872-08002710da99  Successfully provisioned volume pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
Mounted By:  kafka-0
kubectl -n kafka describe pv pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
Name:            pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
Labels:          <none>
Annotations:     hostPathProvisionerIdentity: f4d65c7e-fc00-11e8-b872-08002710da99
                 pv.kubernetes.io/provisioned-by: k8s.io/minikube-hostpath
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    kafka-broker
Status:          Bound
Claim:           kafka/data-kafka-0
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        1Gi
Node Affinity:   <none>
Message:         
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /tmp/hostpath-provisioner/pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
    HostPathType:  
Events:            <none>
minikube ssh
                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ sudo su -
# ls -ltr /tmp/hostpath-provisioner/
total 32
drwxrwxrwx 4 root root 4096 Dec  9 22:26 pvc-1fd1352e-fc01-11e8-b20e-08002710da99
drwxrwxrwx 4 root root 4096 Dec  9 22:26 pvc-1fd7e7da-fc01-11e8-b20e-08002710da99
drwxrwxrwx 4 root root 4096 Dec  9 22:26 pvc-217496ff-fc01-11e8-b20e-08002710da99
drwxrwxrwx 4 root root 4096 Dec  9 22:26 pvc-1fd4df87-fc01-11e8-b20e-08002710da99
drwxrwxrwx 4 root root 4096 Dec  9 22:26 pvc-21708ce8-fc01-11e8-b20e-08002710da99
drwxrwxrwx 2 root root 4096 Dec  9 22:26 pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
drwxrwxrwx 2 root root 4096 Dec  9 22:26 pvc-7ed7f2da-fc01-11e8-b20e-08002710da99
drwxrwxrwx 2 root root 4096 Dec  9 22:26 pvc-7edd6621-fc01-11e8-b20e-08002710da99
# ls -ltr /tmp/hostpath-provisioner/pvc-7ecbb17d-fc01-11e8-b20e-08002710da99
total 0

I think that if it were a volume issue the pods wouldn't even attempt to start. A crash loop is something else. Have you looked at the logs? It's the init container that fails, so use `kubectl -n kafka logs kafka-0 -c init-config`.

Hi @solsson

Thanks, right, I see it now. It seems related to the service account:

kubectl -n kafka logs kafka-0 -c init-config
+ cp /etc/kafka-configmap/log4j.properties /etc/kafka/
+ KAFKA_BROKER_ID=0
+ SEDS=("s/#init#broker.id=#init#/broker.id=$KAFKA_BROKER_ID/")
+ LABELS=kafka-broker-id=0
+ ANNOTATIONS=
+ hash kubectl
++ kubectl get node minikube '-o=go-template={{index .metadata.labels "failure-domain.beta.kubernetes.io/zone"}}'
Error from server (Forbidden): nodes "minikube" is forbidden: User "system:serviceaccount:kafka:default" cannot get nodes at the cluster scope
+ ZONE=
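
The Forbidden error above can be checked outside the init container by impersonating the pod's service account. A hedged sketch (run against the same minikube cluster; the service account name is taken from the error message above):

```shell
# Ask the API server whether the broker pods' default service account
# is allowed the cluster-scoped read that init.sh performs.
kubectl auth can-i get nodes \
  --as=system:serviceaccount:kafka:default
```

If this prints `no`, the service account lacks the cluster-scoped `get nodes` permission that init.sh needs in order to read the zone label.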

OK, I got them running by applying the following:

kubectl create clusterrolebinding fixRBACKafka --clusterrole=cluster-admin --serviceaccount=kafka:default
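
Note that binding `cluster-admin` grants far more than the init script needs. As a narrower sketch (the names here are hypothetical; the repo's rbac-namespace-default manifests are the supported route), a read-only role on nodes would be enough for the `kubectl get node` call in init.sh:

```yaml
# Hypothetical minimal grant: init.sh only reads node labels,
# so "get" on nodes is sufficient.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kafka-node-reader
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kafka-node-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kafka-node-reader
subjects:
- kind: ServiceAccount
  name: default
  namespace: kafka
```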

Right @solsson, so I restarted from scratch. I thought RBAC was not enabled by default with Minikube; it looks like that configuration is required as well.

So the proper instructions for Minikube are:

  • start a fresh minikube
  • apply configure/minikube-storageclasses.yml
  • apply everything in rbac-namespace-default
  • apply the namespace
  • apply everything in Zookeeper
  • apply everything in Kafka

Maybe it would be worth having these steps in the README for Minikube?

Thanks for your help!

Guilhem

Thanks. RBAC has been the default for some time, though I don't know how long. I intentionally keep the main README short. For example, getting the error message about the nonexistent namespace alerts newcomers to the fact that the repo is opinionated about the namespace, and error messages from init containers could one day happen in production, so it helps to get stuck on them early :)

Maybe a pointer to the namespace yaml should be added to https://github.com/Yolean/kubernetes-kafka#getting-started though.

Update: hah, it's already there :)

Thanks for your help @solsson
Got it. I understand why the docs are kept short, but documentation is always a good thing.

The Zookeeper README would deserve some more explanation ;-)

I just wish I knew anything about Zookeeper ;)

lol
I meant a brief explanation of why the Zookeeper containers are started the way they are.
I've read the issue related to it, but it is not easy to piece everything together from the comment history.
No problem though, it was just a remark ;-)