/kubernetes-kafka-small

Tweaked for single-node clusters

Primary language: Shell. License: Apache License 2.0 (Apache-2.0).

Kafka on Kubernetes

Transparent Kafka setup that you can grow with. Good for both experiments and production.

How to use:

  • Run a Kubernetes cluster, minikube or real.
  • Quickstart: use the kubectl apply commands below.
  • Kafka for real: fork and have a look at addons.
  • Join the discussion in issues and PRs.

No readable readme can properly introduce both Kafka and Kubernetes, but we think the combination of the two is a great backbone for microservices. Back when we read Newman we were beginners with both. Now we've read Kleppmann, Confluent and SRE and enjoy this "Streaming Platform" lock-in 😄.

We also think the plain-yaml approach of this project is easier to understand and evolve than helm charts.

What you get

Keep an eye on kubectl --namespace kafka get pods -w.

The goal is to provide Bootstrap servers: kafka-0.broker.kafka.svc.cluster.local:9092,kafka-1.broker.kafka.svc.cluster.local:9092,kafka-2.broker.kafka.svc.cluster.local:9092

Zookeeper at zookeeper.kafka.svc.cluster.local:2181.
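
The bootstrap list above follows the per-pod DNS pattern of a StatefulSet behind a headless service. If you change the number of brokers, a small helper can regenerate the string. This is only a sketch: the function name bootstrap_servers is made up for illustration, and the names kafka, broker and the kafka namespace are taken from the addresses above.

```shell
#!/bin/sh
# Hypothetical helper: print a comma-separated bootstrap list for N brokers.
# Assumes pods named kafka-<n> behind the "broker" service in namespace "kafka".
bootstrap_servers() {
  replicas="${1:-3}"   # number of brokers, defaults to 3
  port="${2:-9092}"    # listener port, defaults to 9092
  i=0
  list=""
  while [ "$i" -lt "$replicas" ]; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}kafka-$i.broker.kafka.svc.cluster.local:$port"
    i=$((i + 1))
  done
  printf '%s\n' "$list"
}
```

Calling bootstrap_servers 3 should reproduce the three-broker string shown above.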

Start Zookeeper

The Kafka book recommends that Kafka has its own Zookeeper cluster with at least 5 instances.

kubectl apply -f ./zookeeper/

To support automatic migration in the face of availability zone unavailability we mix persistent and ephemeral storage.

Start Kafka

kubectl apply -f ./

You might want to verify in logs that Kafka found its own DNS name(s) correctly. Look for records like:

kubectl -n kafka logs kafka-0 | grep "Registered broker"
# INFO Registered broker 0 at path /brokers/ids/0 with addresses: PLAINTEXT -> EndPoint(kafka-0.broker.kafka.svc.cluster.local,9092,PLAINTEXT)
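
To compare what each broker registered against the expected DNS name, you could pull just the host and port out of that log record. A minimal sketch, assuming the log line has the EndPoint(host,port,protocol) shape shown above; the function name advertised_endpoint is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read broker logs on stdin and print host:port
# for every "EndPoint(host,port,protocol)" record found.
advertised_endpoint() {
  sed -n 's/.*EndPoint(\([^,]*\),\([0-9]*\),.*/\1:\2/p'
}
```

Usage, per broker: kubectl -n kafka logs kafka-0 | advertised_endpoint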

That's it. Just add business value 😉. For clients we tend to use librdkafka-based drivers like node-rdkafka. To use Kafka Connect and Kafka Streams you may want to take a look at our sample Dockerfiles. And don't forget the addons.

RBAC

For clusters that enforce RBAC there's a minimal set of policies in:

kubectl apply -f rbac-namespace-default/

Caution: Delete Reclaim Policy is default

In production you likely want to manually set Reclaim Policy, or your data will be gone if the generated volume claims are deleted.

This can't be done in manifests, at least not until Kubernetes 1.8.
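
One way to set the policy by hand is to patch each generated PersistentVolume with kubectl patch. The sketch below assumes the claims live in the kafka namespace and that you want Retain on all of them; the function name retain_volumes is hypothetical. Review the volume list before running anything like this against a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: set reclaim policy to Retain on every volume
# bound to a claim in the given namespace (defaults to "kafka").
retain_volumes() {
  namespace="${1:-kafka}"
  # spec.volumeName on a bound claim is the name of its PersistentVolume.
  for pv in $(kubectl --namespace "$namespace" get pvc \
      -o jsonpath='{.items[*].spec.volumeName}'); do
    kubectl patch pv "$pv" \
      -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
  done
}
```

Afterwards, kubectl get pv shows the result in the RECLAIM POLICY column.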

Tests

kubectl apply -f test/
# Anything that isn't READY here is a failed test
kubectl get pods -l test-target=kafka,test-type=readiness -w --all-namespaces
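
For a one-shot check instead of watching, the READY column can be filtered with awk. A rough sketch, assuming the column layout of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, ...); the function name not_ready_pods is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "kubectl get pods --all-namespaces --no-headers"
# output on stdin and print namespace/name for pods whose READY count
# (for example 0/1) is not complete.
not_ready_pods() {
  awk 'NF >= 3 { split($3, r, "/"); if (r[1] != r[2]) print $1 "/" $2 }'
}
```

Usage: kubectl get pods -l test-target=kafka,test-type=readiness --all-namespaces --no-headers | not_ready_pods. Empty output means every test pod reported ready.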