Scala Apache Kafka Producer and Consumer examples
Using Scala, there are 4 examples of the Producer and Consumer APIs:
- Avro Producer using the Schema Registry : com.github.polomarcus.main.MainKafkaAvroProducer
- Avro Consumer using the Schema Registry : com.github.polomarcus.main.MainKafkaAvroConsumer
- String Producer : com.github.polomarcus.main.MainKafkaProducer
- String Consumer : com.github.polomarcus.main.MainKafkaConsumer
- docker and compose (using the Conduktor's docker-compose.yml)
sbt
if not using docker to run the scala app- Optional : Conduktor (Kafka User Interface)
Start multiples kakfa servers (called brokers) using the docker compose recipe docker-compose.yml
:
docker-compose -f docker-compose.yml up --detach
sbt "runMain com.github.polomarcus.main.MainKafkaProducer"
# OR
sbt run
# and type "2" to run "com.github.polomarcus.main.MainKafkaProducer"
### Docker
```bash
docker-compose run my-scala-kafka-app bash
> sbt
> run
Your ops team tells your app is slow and the CPU is not used much, they were hoping to help you but they are not Kafka experts.
- Look at the method
producer.flush()
, can you improve the speed of the program ? - What about batching the messages ? Help
Your friendly ops team warns you about kafka disks starting to be full. What can you do ?
Tips :
- What about messages compression ? Can you implement it ? You heard that snappy compression is great.
- What about messages lifetime on your kafka brokers ? Can you change your topic config ?
After a while and a lot of deployments and autoscaling (adding and removing due to traffic spikes), on your data quality dashboard you are seeing some messages are duplicates or missing. What can you do ?
- What are "acks" ? when to use acks=0 ? when to use acks=all?
- Can idempotence help us ?
- what is "min.insync.replicas" ?
Look at :
- your docker-compose.yml, and the schema-registry service.
- Inside Conduktor, configure the connection with your schema-registry (http://localhost:8081)
- What are the benefits to use a Schema Registry for messages ? Help
- Where are stored schemas information ?
- What is serialization ? Help
- What serialization format are supported ? Help
- Why is the Avro format so compact ? Help
- What are the best practices to run a Schema Registry in production ? Help1 and Help2
- How to create a custom serializer ?
- Kafka Streams Data Types and Serialization
- About schema evolution
- https://sparkbyexamples.com/kafka/apache-kafka-consumer-producer-in-scala/
- https://www.confluent.io/fr-fr/blog/kafka-scala-tutorial-for-beginners/
- https://developer.confluent.io/learn-kafka/kafka-streams/get-started/
- Hands-on Kafka Streams in Scala
- Scala, Avro Serde et Schema registry
- Usage as a Kafka Serde (kafka lib for avro)
- Datadog's Kafka dashboard overview