Spark Streaming Kafka Cassandra Starter

Spark Streaming Kafka Cassandra Starter is a compact, easy-to-use example of using Kafka, Spark Streaming, and Cassandra together.

Technologies

  • Java 8
  • Scala 2.10
  • Gradle
  • Kafka
  • Spark
  • Cassandra 2.2.6
  • Docker

How to run manually

Build the application:

./gradlew shadowJar

Copy the log4j.properties file to build/libs, then start the application with docker-compose:

docker-compose run -d spark

Open a shell in the Kafka container:

docker exec -it $(docker-compose ps -q kafka) bash

Create the topic inside the Kafka container:

kafka-topics.sh --create --zookeeper $ZOOKEEPER --replication-factor 1 --partitions 2 --topic events

Check the topic inside the Kafka container:

kafka-topics.sh --list --zookeeper $ZOOKEEPER
kafka-topics.sh --describe --zookeeper $ZOOKEEPER --topic events
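
If this check is scripted rather than typed by hand, the listing printed by kafka-topics.sh --list (one topic name per line) can be matched against the exact topic name. A minimal sketch; the function name topic_exists is hypothetical and not part of this project:

```shell
# Hypothetical helper, not part of the repository.
# Reads a topic listing on stdin (one name per line, as printed by
# `kafka-topics.sh --list`) and succeeds only if the given topic
# name appears as a complete line.
topic_exists() {
  # -q: no output, -x: match the whole line, -F: treat the name literally
  grep -qxF -- "$1"
}

# Usage inside the Kafka container (assumes $ZOOKEEPER is set, as above):
#   kafka-topics.sh --list --zookeeper "$ZOOKEEPER" | topic_exists events \
#     && echo "topic events exists"
```

Matching the whole line avoids a false positive when another topic merely contains the string events.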

Then exit the Kafka container and open a shell in the Spark container:

docker exec -it $(docker-compose ps -q spark) bash

Start the master and a worker inside the Spark container:

./usr/local/spark/sbin/start-master.sh
./usr/local/spark/sbin/start-slave.sh $MASTER

Run the spark-consumer application inside the Spark container:

spark-submit \
--master $MASTER \
--conf spark.cassandra.connection.host=cassandra \
app/spark-consumer-0.0.1.jar kafka:9092 events
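
The two trailing arguments passed to the jar look like the Kafka broker address and the topic created earlier. That reading is inferred from the command itself, not stated by the project. A minimal sketch of a wrapper that makes both values explicit; the function name submit_cmd is hypothetical, and it only prints the command rather than running it:

```shell
# Hypothetical helper, not part of the repository.
# Prints the spark-submit invocation for a given master URL.
# The broker address and topic default to the values used above;
# treating them as "broker" and "topic" is an assumption.
submit_cmd() {
  local master="$1"
  local brokers="${2:-kafka:9092}"
  local topic="${3:-events}"
  printf 'spark-submit --master %s --conf spark.cassandra.connection.host=cassandra app/spark-consumer-0.0.1.jar %s %s\n' \
    "$master" "$brokers" "$topic"
}

# Usage inside the Spark container: print the command, then run it
#   submit_cmd "$MASTER"
#   eval "$(submit_cmd "$MASTER")"
```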

Then, from the host, run the kafka-producer application in the Kafka container:

docker exec -it $(docker-compose ps -q kafka) java -jar ../app/kafka-producer-0.0.1.jar

Query Cassandra to see the results:

docker exec -it $(docker-compose ps -q cassandra) cqlsh -e "SELECT * FROM api.event"
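
Cassandra can take a while to accept connections after its container starts, so a scripted version of this query may fail on the first attempt. A minimal sketch of a generic retry helper; the name retry and the attempt and delay values in the usage line are hypothetical:

```shell
# Hypothetical helper, not part of the repository.
# retry ATTEMPTS DELAY CMD [ARGS...]
# Runs CMD until it exits with status 0 or ATTEMPTS runs have failed,
# sleeping DELAY seconds between runs. Returns 0 on success, 1 otherwise.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt will follow
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage from the host (-it is dropped because a script has no terminal):
#   retry 10 3 docker exec "$(docker-compose ps -q cassandra)" \
#     cqlsh -e "SELECT * FROM api.event"
```

The helper only waits for the query to run successfully. It does not check whether any rows were returned, so an empty result still counts as success.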

To delete locked build directories, run:

sudo rm -rf spark-consumer/build kafka-producer/build model-common/build web-api/build

License

The MIT License (MIT)