/microservices-clickstream-analytics

A service that reads a clickstream from a Kafka topic and performs real-time analytics.

Primary language: Java. License: MIT.

ClickStream Analytics Service

Instructions

  1. Start ZooKeeper and then Apache Kafka, each in its own terminal (both commands run in the foreground).
$ ./$KAFKA_BIN/bin/zookeeper-server-start.sh ./$KAFKA_BIN/config/zookeeper.properties
$ ./$KAFKA_BIN/bin/kafka-server-start.sh ./$KAFKA_BIN/config/server.properties
  2. Write the clickstream into the clickstream Kafka topic.

  3. Compile and run the Spark application using sbt:

$ cd stream-analytics-engine
$ sbt compile
$ sbt run
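
For a quick smoke test of step 2 without the simulator, lines from a local file can be piped into the topic with Kafka's console producer. The following is a minimal sketch, not part of this repository: the function name `publish_file` is hypothetical, it assumes a Kafka version whose console producer accepts `--bootstrap-server` (older releases use `--broker-list` instead), and it sends each line as-is, which may or may not match the record format the analytics engine expects.

```shell
# Hypothetical helper: send every line of a file to the clickstream topic.
# Assumes KAFKA_BIN is set the same way as in the commands above.
publish_file() {
  if [ ! -r "$1" ]; then
    echo "cannot read dataset: $1" >&2
    return 1
  fi
  # The console producer reads records from stdin, one per line.
  "./$KAFKA_BIN/bin/kafka-console-producer.sh" \
    --topic clickstream \
    --bootstrap-server localhost:9092 < "$1"
}
```

Calling `publish_file some-dataset.csv` (with a file name of your choosing) would then write that file's lines to the topic.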

Generate a simulated clickstream

  1. Compile the clickstream-simulator application using Maven:
$ cd clickstream-simulator
$ mvn clean compile
$ mvn package
  2. Download a clickstream dataset from here. I used this one.

  3. Create a clickstream topic to write to, then list the topics to confirm it exists.

$ ./$KAFKA_BIN/bin/kafka-topics.sh --create \
--topic clickstream \
--bootstrap-server localhost:9092

$ ./$KAFKA_BIN/bin/kafka-topics.sh --list \
--bootstrap-server localhost:9092
  4. Start a console consumer to watch the records arriving on the topic:
$ ./$KAFKA_BIN/bin/kafka-console-consumer.sh --topic clickstream \
--from-beginning \
--bootstrap-server localhost:9092
  5. Start the clickstream simulator:
$ java -jar target/clickstreamproducer-0.1.0.jar
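
Before launching the simulator, it can be useful to confirm that the topic from step 3 really exists. The sketch below wraps the same `kafka-topics.sh --list` call used above; the function name `topic_exists` is hypothetical, and the broker address is assumed to be `localhost:9092` as in the rest of this README.

```shell
# Hypothetical helper: succeed only if the named topic is in the broker's list.
# Assumes KAFKA_BIN is set the same way as in the commands above.
topic_exists() {
  # -x matches the whole line, -F treats the name as a literal string.
  "./$KAFKA_BIN/bin/kafka-topics.sh" --list \
    --bootstrap-server localhost:9092 | grep -qxF -- "$1"
}
```

With this in place, `topic_exists clickstream && java -jar target/clickstreamproducer-0.1.0.jar` would start the simulator only when the topic is present.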