/Sparkify-Data-Lake-with-Apache-Spark

This project has as output a Data Lake solution. It building an ETL pipeline that extracts their data from S3, processes them using Spark, and loads the data back into S3 as a set of dimensional tables. This will allow their analytics team to continue finding insights in what songs their users are listening to.

Primary LanguagePython

Stargazers