DebanjanSarkar/pyspark-maestro
This repo contains PySpark implementations of real-world use cases: batch data processing, streaming data processing from sources such as Kafka and sockets, Spark optimizations, solutions to business-specific big data processing scenarios, and machine learning.
Jupyter Notebook