/Retail_Data_Project

Real-Time Retail Data Analysis using Spark Streaming Integrated with Kafka.

Primary Language: Python

Real-time analysis of retail transaction receipts using Spark Streaming integrated with Kafka.

  1. Retail data in JSON format (the transaction receipts) is sent to a Kafka topic.
  2. The JSON data is processed using Spark Streaming.
  3. Key Performance Indicators (KPIs) are calculated using PySpark functions.
  4. The results are written back to HDFS for further analysis with visualization tools.
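
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It makes several assumptions: the receipt fields (`invoice_no`, `timestamp`, `items`, `unit_price`, `quantity`), the KPI names (`orders`, `total_sales`), the one-minute window, and the function names are all hypothetical. It also uses the Structured Streaming API rather than the older DStream API, and needs the `spark-sql-kafka-0-10` connector on the Spark classpath. `receipt_kpis` is a Spark-free reference for the per-receipt arithmetic.

```python
import json

# Hypothetical receipt layout; the project's real JSON fields are not shown here.
SAMPLE_RECEIPT = json.dumps({
    "invoice_no": "INV-1",
    "timestamp": "2024-01-01 10:00:00",
    "items": [
        {"sku": "A", "unit_price": 2.5, "quantity": 4},
        {"sku": "B", "unit_price": 10.0, "quantity": 1},
    ],
})


def receipt_kpis(raw_json):
    """Pure-Python reference for the per-receipt values computed in step 3."""
    receipt = json.loads(raw_json)
    items = receipt["items"]
    return {
        "invoice_no": receipt["invoice_no"],
        "total_cost": sum(i["unit_price"] * i["quantity"] for i in items),
        "total_items": sum(i["quantity"] for i in items),
    }


def start_kpi_stream(spark, brokers, topic, out_path, checkpoint_path):
    """Steps 1-4: read from Kafka, parse JSON, compute KPIs, write to HDFS."""
    # Imported here so the helper above can be used without PySpark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import (ArrayType, DoubleType, IntegerType,
                                   StringType, StructField, StructType)

    item = StructType([
        StructField("sku", StringType()),
        StructField("unit_price", DoubleType()),
        StructField("quantity", IntegerType()),
    ])
    schema = StructType([
        StructField("invoice_no", StringType()),
        StructField("timestamp", StringType()),
        StructField("items", ArrayType(item)),
    ])

    # Steps 1-2: subscribe to the topic; Kafka delivers the payload as bytes.
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", brokers)
           .option("subscribe", topic)
           .load())

    receipts = (raw
                .select(F.from_json(F.col("value").cast("string"), schema).alias("r"))
                .select("r.*")
                .withColumn("event_time", F.to_timestamp("timestamp"))
                # Sum price * quantity over the items array of each receipt.
                .withColumn("total_cost", F.expr(
                    "aggregate(items, 0D, (acc, i) -> acc + i.unit_price * i.quantity)")))

    # Step 3: windowed KPIs; the watermark lets finished windows be appended.
    kpis = (receipts
            .withWatermark("event_time", "1 minute")
            .groupBy(F.window("event_time", "1 minute"))
            .agg(F.count("invoice_no").alias("orders"),
                 F.sum("total_cost").alias("total_sales")))

    # Step 4: write the results as Parquet files to the given HDFS path.
    return (kpis.writeStream.format("parquet")
            .outputMode("append")
            .option("path", out_path)
            .option("checkpointLocation", checkpoint_path)
            .start())
```

With an existing `SparkSession`, a call might look like `start_kpi_stream(spark, "localhost:9092", "retail-receipts", "hdfs:///retail/kpis", "hdfs:///retail/checkpoints")`. The broker address, topic name, and paths in that call are placeholders.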