scala-kafka-client-examples: A Scala repository from polomarcus

Scala Apache Kafka Producer and Consumer examples

This repository contains four Scala examples of the Kafka Producer and Consumer APIs:

Avro Producer using the Schema Registry : com.github.polomarcus.main.MainKafkaAvroProducer
Avro Consumer using the Schema Registry : com.github.polomarcus.main.MainKafkaAvroConsumer
String Producer : com.github.polomarcus.main.MainKafkaProducer
String Consumer : com.github.polomarcus.main.MainKafkaConsumer

Why use Kafka?

Tools

Docker and Docker Compose (using Conduktor's docker-compose.yml)
sbt, if you are not using Docker to run the Scala app
Optional: Conduktor (a Kafka user interface)

Start

Start multiple Kafka servers (called brokers) using the Docker Compose recipe docker-compose.yml:

docker-compose -f docker-compose.yml up --detach

SBT

sbt "runMain com.github.polomarcus.main.MainKafkaProducer"
# OR
sbt run
# and type "2" to run "com.github.polomarcus.main.MainKafkaProducer"

### Docker
```bash
docker-compose run my-scala-kafka-app bash
> sbt
> run
```

Some questions about producer and consumer

Question 1

Your ops team tells you that your app is slow and that the CPU is barely used. They would like to help, but they are not Kafka experts.

Look at the method producer.flush(). Can you improve the speed of the program?
What about batching the messages? Help
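
One possible starting point for this question, as a minimal sketch: the settings below ask the producer to group records into batches. `linger.ms` and `batch.size` are standard Kafka producer settings; the object name `BatchingProducerConfig`, the default values, and the broker address used in the usage note are assumptions for illustration, not values taken from this repository.

```scala
import java.util.Properties

object BatchingProducerConfig {
  // Builds producer settings that favour batching.
  // The defaults (20 ms, 64 KiB) are illustrative, not tuned values.
  def props(bootstrapServers: String,
            lingerMs: Int = 20,
            batchSizeBytes: Int = 64 * 1024): Properties = {
    val p = new Properties()
    p.setProperty("bootstrap.servers", bootstrapServers)
    p.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    p.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Wait up to lingerMs for more records before sending a batch.
    p.setProperty("linger.ms", lingerMs.toString)
    // Upper bound, in bytes, of a batch for a single partition.
    p.setProperty("batch.size", batchSizeBytes.toString)
    p
  }
}
```

These properties can be passed to `new KafkaProducer[String, String](props)`, for example with `BatchingProducerConfig.props("localhost:9092")` if a broker listens on that address. Batching only helps if the code does not call `producer.flush()` after every `send`, since `flush()` blocks until all buffered records have been sent; calling it once after the loop (or relying on `close()`) leaves the producer free to fill its batches.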

Question 2

Your friendly ops team warns you that the Kafka disks are filling up. What can you do?

Tips:

What about message compression? Can you implement it? You heard that snappy compression is great.
What about message lifetime on your Kafka brokers? Can you change your topic config?
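
The two tips above can be sketched as follows. `compression.type` is a standard producer setting and `retention.ms` a standard topic setting; the object name `DiskUsageConfig`, its helper names, and the 7-day retention used in the usage note are hypothetical and only serve the illustration.

```scala
import java.util.Properties
import java.util.concurrent.TimeUnit

object DiskUsageConfig {
  // Producer side: compress batches before they are sent to the brokers.
  // Mutates and returns the given Properties for convenience.
  def enableCompression(props: Properties, codec: String = "snappy"): Properties = {
    props.setProperty("compression.type", codec)
    props
  }

  // Topic side: config entry limiting how long messages are kept, in milliseconds.
  def retentionConfig(days: Long): Map[String, String] =
    Map("retention.ms" -> TimeUnit.DAYS.toMillis(days).toString)
}
```

For example, `DiskUsageConfig.retentionConfig(7)` yields the entry to apply to a topic, through Kafka's AdminClient or the `kafka-configs` tool, so that messages older than a week become eligible for deletion. Compression works per batch, so it tends to pay off more when batching is also enabled.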

Question 3

After a while, and after many deployments and autoscaling events (brokers added and removed because of traffic spikes), your data quality dashboard shows that some messages are duplicated or missing. What can you do?

What are "acks"? When should you use acks=0? When should you use acks=all?
Can idempotence help us?
What is "min.insync.replicas"?
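
As a hedged sketch of where these three settings live: `acks` and `enable.idempotence` are producer settings, while `min.insync.replicas` is a topic (or broker) setting. The object name `ReliableProducerConfig` is hypothetical, and the value `2` assumes a topic with a replication factor of 3, which this README does not state.

```scala
import java.util.Properties

object ReliableProducerConfig {
  def props(bootstrapServers: String): Properties = {
    val p = new Properties()
    p.setProperty("bootstrap.servers", bootstrapServers)
    // Wait for all in-sync replicas to acknowledge each write.
    p.setProperty("acks", "all")
    // Let the broker discard duplicates caused by producer retries.
    p.setProperty("enable.idempotence", "true")
    p
  }

  // Topic-level entry: with acks=all, a write is rejected when fewer than
  // this many replicas are in sync. The value 2 assumes replication factor 3.
  val topicConfig: Map[String, String] = Map("min.insync.replicas" -> "2")
}
```

With `acks=0` the producer does not wait for any acknowledgment, which is fast but can silently lose messages; `acks=all` combined with `min.insync.replicas` trades some latency for durability.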

About the Schema Registry

Intro

Look at:

your docker-compose.yml, and in particular the schema-registry service.
Conduktor, where you can configure the connection to your Schema Registry (http://localhost:8081)
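
On the application side, a producer reaches the same Schema Registry through its configuration. The sketch below shows the usual Confluent settings; the object name `AvroProducerConfig` is hypothetical, and MainKafkaAvroProducer in this repository may configure things differently.

```scala
import java.util.Properties

object AvroProducerConfig {
  // schemaRegistryUrl defaults to the address mentioned above.
  def props(bootstrapServers: String,
            schemaRegistryUrl: String = "http://localhost:8081"): Properties = {
    val p = new Properties()
    p.setProperty("bootstrap.servers", bootstrapServers)
    p.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Confluent's Avro serializer registers and looks up schemas in the registry.
    p.setProperty("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
    p.setProperty("schema.registry.url", schemaRegistryUrl)
    p
  }
}
```

The serializer class comes from Confluent's `kafka-avro-serializer` library, so that dependency has to be on the classpath for these properties to work with a real producer.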

Questions

What are the benefits of using a Schema Registry for messages? Help
Where is schema information stored?
What is serialization? Help
Which serialization formats are supported? Help
Why is the Avro format so compact? Help
What are the best practices for running a Schema Registry in production? Help1 and Help2
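
As a hint for the compactness question, here is a small sketch of how the Avro specification encodes `int` and `long` values: zig-zag coding followed by a variable-length encoding, so small numbers take a single byte. This is an illustration written for this README (the object name `AvroLongEncoding` is hypothetical), not code from the Avro library, and it covers only one of the reasons Avro is compact.

```scala
import scala.collection.mutable.ArrayBuffer

object AvroLongEncoding {
  // Zig-zag maps signed values to unsigned ones so small magnitudes stay small:
  // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
  def zigZag(n: Long): Long = (n << 1) ^ (n >> 63)

  // Variable-length encoding: 7 payload bits per byte,
  // with the high bit set when more bytes follow.
  def encodeLong(n: Long): Array[Byte] = {
    var v = zigZag(n)
    val out = ArrayBuffer.empty[Byte]
    while ((v & ~0x7FL) != 0) {
      out += ((v & 0x7F) | 0x80).toByte
      v >>>= 7
    }
    out += v.toByte
    out.toArray
  }
}
```

For instance, `AvroLongEncoding.encodeLong(1)` produces the single byte `0x02`, and `encodeLong(64)` produces the two bytes `0x80 0x01`, compared with the 8 bytes of a fixed-width long. Avro's binary format also omits field names from the payload, since the reader gets them from the schema.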
