- Retail data, in the form of JSON transaction records, is sent to a Kafka topic.
- The JSON data is processed using Spark Streaming.
- Key Performance Indicators (KPIs) are calculated using PySpark functions.
- The processed data is written to HDFS so it can be analyzed further with visualization tools.
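
The KPI step above can be illustrated with a minimal plain-Python sketch of per-transaction logic. The message schema (`invoice_no`, `type`, `items`, `quantity`, `unit_price`), the chosen KPIs, and the function name `invoice_kpis` are all assumptions made for illustration; the project's actual fields and KPIs may differ.

```python
import json


def invoice_kpis(raw_message: str) -> dict:
    """Compute per-invoice KPIs from one JSON message.

    The schema used here is hypothetical, not taken from the project.
    """
    invoice = json.loads(raw_message)
    items = invoice.get("items", [])

    # Total number of units across all line items.
    total_items = sum(item["quantity"] for item in items)
    # Total monetary value of the invoice.
    total_cost = sum(item["quantity"] * item["unit_price"] for item in items)

    # Assumption: a "RETURN" type marks a refund, so its cost counts as negative.
    is_return = invoice.get("type") == "RETURN"
    if is_return:
        total_cost = -total_cost

    return {
        "invoice_no": invoice.get("invoice_no"),
        "total_items": total_items,
        "total_cost": round(total_cost, 2),
        "is_return": is_return,
    }


# Hypothetical sample message, shaped like what a Kafka topic might carry.
sample = json.dumps({
    "invoice_no": "INV-1",
    "type": "ORDER",
    "items": [
        {"quantity": 2, "unit_price": 10.0},
        {"quantity": 3, "unit_price": 2.5},
    ],
})
kpis = invoice_kpis(sample)
print(kpis)
```

In a Spark job, logic like this would typically be expressed with built-in column functions or wrapped with `pyspark.sql.functions.udf` and applied to the parsed stream; the standalone version here only shows the arithmetic.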
shardul-rajhans/Retail_Data_Project
Real-Time Retail Data Analysis using Spark Streaming Integrated with Kafka.
Python