Yolean/kubernetes-kafka

Using AWS ELB addresses for outside listeners


I want to configure Kafka on my Kubernetes cluster so that it is accessible from outside the cluster. I cannot use a NodePort together with the node's IP address.

Instead, I configured one Service of type LoadBalancer for each broker and modified init.sh to use the ELB's external hostname:

OUTSIDE_HOST=$(kubectl get svc outside-${KAFKA_BROKER_ID} -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
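Each of these Services looks roughly like the sketch below (names, labels, and ports are illustrative of my setup, not taken from the repo):

    # outside-0 -- one Service of type LoadBalancer per broker
    apiVersion: v1
    kind: Service
    metadata:
      name: outside-0
    spec:
      type: LoadBalancer
      selector:
        app: kafka
        kafka-broker-id: "0"   # pod label set by init.sh
      ports:
      - port: 32400            # matches OUTSIDE_PORT=3240${KAFKA_BROKER_ID}
        targetPort: 9092       # this target port turns out to be the problem (see below)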

I then created the ConfigMaps and started the Kafka StatefulSets. I can see that /etc/kafka/server.properties gets populated with the correct DNS entry for OUTSIDE_HOST:

advertised.listeners=OUTSIDE://a17c8eeeavcdefd1234566-12345678.us-east-1.elb.amazonaws.com:32400,PLAINTEXT://:9092
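For context, the listener setup in this repo's server.properties is roughly the following (assuming the standard two-listener layout; the exact lines may differ):

    listeners=OUTSIDE://:9094,PLAINTEXT://:9092
    listener.security.protocol.map=PLAINTEXT:PLAINTEXT,OUTSIDE:PLAINTEXT
    inter.broker.listener.name=PLAINTEXT

So a client that reaches a broker on port 9092 gets the internal (PLAINTEXT) advertised addresses back, while port 9094 serves the OUTSIDE ones.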

However, the broker addresses returned to clients outside the Kubernetes cluster are the internal cluster DNS names:

Metadata for all topics (from broker -1: a8eaabbccddeeff-123456.us-east-2.elb.amazonaws.com:9092/bootstrap):
 3 brokers:
  broker 2 at kafka-2.broker.kafka-test.svc.cluster.local:9092
  broker 1 at kafka-1.broker.kafka-test.svc.cluster.local:9092
  broker 0 at kafka-0.broker.kafka-test.svc.cluster.local:9092

As a result, the brokers are not reachable from outside the Kubernetes cluster.

Are there other changes needed for each broker's ELB address (the OUTSIDE address) to show up?

I haven't tried using load balancers, but you've verified the generated config, and I see no reason why node addresses and external addresses would behave differently. The only issue I've seen with wrong addresses being returned at bootstrap was fixed in 4c202f4.

Oh, I just noticed that the port is 9092 together with the external name. Can you point the load balancers to port 9094 instead?

Thanks! That did the trick.
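Concretely, the fix was pointing each outside-N Service at the OUTSIDE listener port instead of the internal one. A hypothetical patch command (service name and port index are from my setup above):

    # repoint the load balancer target from the internal PLAINTEXT
    # listener (9092) to the OUTSIDE listener (9094); repeat per broker
    kubectl patch svc outside-0 --type=json \
      -p '[{"op": "replace", "path": "/spec/ports/0/targetPort", "value": 9094}]'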

For posterity: I ran into one more problem. The init.sh script also sets OUTSIDE_HOST and OUTSIDE_PORT as labels on the pod. Kubernetes limits label values to 63 characters, and my AWS ELB DNS name was 71 characters, so the labeling command failed and the pod ended up with no labels at all.

The pods were therefore also missing the important broker-id label (kafka-broker-id: "N"). The Kubernetes Service uses this label as a selector, so the pods couldn't be reached.
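To illustrate the failure mode (a hypothetical session; the placeholder stands in for my 71-character ELB name):

    # init.sh applies all labels in a single kubectl command, so one
    # over-long value (> 63 chars) fails validation for the whole call
    # and the pod gets none of the labels, including kafka-broker-id
    kubectl -n kafka-test label pod kafka-0 \
      kafka-broker-id=0 \
      kafka-listener-outside-host=<71-character-elb-dns-name>
    # => metadata.labels: Invalid value: ...: must be no more than 63 characters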

I fixed this by setting OUTSIDE_HOST and OUTSIDE_PORT as annotations instead of labels [1]. After that, I was able to connect to Kafka from outside AWS and could produce and consume messages.

[1]

     LABELS="kafka-broker-id=$KAFKA_BROKER_ID"

+    ANNOTATION=""
+
     hash kubectl 2>/dev/null || {
       sed -i "s/#init#broker.rack=#init#/#init#broker.rack=# kubectl not found in path/" /etc/kafka/server.properties
     } && {
@@ -26,18 +28,22 @@ data:
         LABELS="$LABELS kafka-broker-rack=$ZONE"
       fi

-      OUTSIDE_HOST=$(kubectl get node "$NODE_NAME" -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
+      #OUTSIDE_HOST=$(kubectl get node "$NODE_NAME" -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
+      OUTSIDE_HOST=$(kubectl get svc outside-${KAFKA_BROKER_ID} -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
       if [ $? -ne 0 ]; then
         echo "Outside (i.e. cluster-external access) host lookup command failed"
       else
         OUTSIDE_PORT=3240${KAFKA_BROKER_ID}
         sed -i "s|#init#advertised.listeners=OUTSIDE://#init#|advertised.listeners=OUTSIDE://${OUTSIDE_HOST}:${OUTSIDE_PORT}|" /etc/kafka/server.properties
-        LABELS="$LABELS kafka-listener-outside-host=$OUTSIDE_HOST kafka-listener-outside-port=$OUTSIDE_PORT"
+        ANNOTATION="kafka-listener-outside-host=$OUTSIDE_HOST kafka-listener-outside-port=$OUTSIDE_PORT"
       fi

       if [ ! -z "$LABELS" ]; then
         kubectl -n $POD_NAMESPACE label pod $POD_NAME $LABELS || echo "Failed to label $POD_NAMESPACE.$POD_NAME - RBAC issue?"
       fi
+      if [ ! -z "$ANNOTATION" ]; then
+        kubectl -n $POD_NAMESPACE annotate pods $POD_NAME $ANNOTATION || echo "Failed to annotate $POD_NAMESPACE.$POD_NAME - RBAC issue?"
+      fi
     }
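After that change, a quick sanity check (namespace and names are from my setup):

    # the long ELB name now lives in an annotation, which has no
    # 63-character per-value limit...
    kubectl -n kafka-test get pod kafka-0 \
      -o jsonpath='{.metadata.annotations.kafka-listener-outside-host}'
    # ...and the broker-id label the Service selector depends on is back
    kubectl -n kafka-test get pods -L kafka-broker-id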

@shrinandj I was unaware of the limit on label values. It makes sense. Care to review #137?