
ScalaSparkKafka

It loads data to Kafka through Apache Spark and reads it back.
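At its core, the round trip can be sketched as below: write a DataFrame to a Kafka topic with Spark's kafka sink, then read it back in batch mode. This is a minimal sketch, not necessarily the exact code in this repository; the object name and the broker address (localhost:9092, assuming the docker-compose file maps the broker port to the host) are illustrative assumptions.

import org.apache.spark.sql.SparkSession

object KafkaRoundTrip {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ScalaSparkKafka")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Spark's Kafka sink expects a string/binary "value" column (and optionally "key")
    (1 to 100).toDF("n")
      .selectExpr("CAST(n AS STRING) AS value")
      .write
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumption: broker reachable from the host
      .option("topic", "meu-topico-legal")
      .save()

    // batch read of everything currently in the topic
    spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "meu-topico-legal")
      .option("startingOffsets", "earliest")
      .load()
      .selectExpr("CAST(value AS STRING) AS value")
      .show(10, truncate = false)

    spark.stop()
  }
}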

This repository is partially based on these tutorials:

Requirements

To run this code, you will need:

  • sbt
  • Java 8+
  • docker
  • on Windows:
    • set up winutils.exe and hadoop.dll, as described here (see the sketch after this list for a programmatic alternative).
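A common alternative to setting environment variables by hand is pointing Hadoop at the winutils directory from code, before the SparkSession is created. This is a sketch of that workaround, not something this repository necessarily does; the path is an example.

// Windows-only workaround (sketch): hadoop.home.dir must point at the folder
// whose bin\ subdirectory contains winutils.exe and hadoop.dll; path is an example
System.setProperty("hadoop.home.dir", "C:\\hadoop")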

Setting up services

docker-compose up -d # start Kafka and ZooKeeper in the background
docker-compose ps # check that the expected services are running
docker-compose logs zookeeper | grep -i binding # check the ZooKeeper logs
docker-compose logs kafka | grep -i started # check the Kafka logs

Test drive

# create a new topic
docker-compose exec kafka kafka-topics --create --topic meu-topico-legal --partitions 1 --replication-factor 1 --if-not-exists --zookeeper zookeeper:2181
# check that the topic exists
docker-compose exec kafka kafka-topics --describe --topic meu-topico-legal --zookeeper zookeeper:2181
# produce 100 messages
docker-compose exec kafka bash -c "seq 100 | kafka-console-producer --request-required-acks 1 --broker-list kafka:9092 --topic meu-topico-legal && echo 'Produced 100 messages.'"
# consume 100 messages
docker-compose exec kafka kafka-console-consumer --bootstrap-server kafka:9092 --topic meu-topico-legal --from-beginning --max-messages 100

How to run

sbt run
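For sbt run to compile the project, the build needs Spark SQL plus the Kafka connector for Spark SQL on the classpath. A minimal build.sbt along these lines should work; the Scala and Spark versions below are assumptions, so check the repository's actual build file.

// build.sbt (sketch) -- library versions are assumptions
name := "ScalaSparkKafka"

scalaVersion := "2.12.18"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"            % "3.5.1",
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "3.5.1"
)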