/Sparkify-Data-Pipelines-with-Airflow-S3-and-Redshift

This project delivers a data warehouse solution backed by high-grade data pipelines that are dynamic, built from reusable tasks, monitored, and easy to backfill. Because data quality plays a big part when analyses are run on top of the data warehouse, tests are run against the datasets after the ETL steps have executed to catch any discrepancies.
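As a rough illustration of the post-ETL tests described above, here is a minimal sketch of two data quality checks. It is not taken from this repository: the function names, the `users` table, and its columns are hypothetical, and an in-memory SQLite database stands in for Redshift so the example runs on its own. In an Airflow pipeline, checks like these would typically run in a task placed after the load steps.

```python
import sqlite3


def check_table_has_rows(conn, table):
    """Return the row count of `table`, raising ValueError if it is empty."""
    # Table names cannot be bound as SQL parameters, so pass trusted names only.
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if count == 0:
        raise ValueError(f"Data quality check failed: {table} contains no rows")
    return count


def check_no_nulls(conn, table, column):
    """Raise ValueError if `column` of `table` contains any NULL values."""
    nulls = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    ).fetchone()[0]
    if nulls > 0:
        raise ValueError(
            f"Data quality check failed: {table}.{column} has {nulls} NULL value(s)"
        )


# Hypothetical stand-in for a warehouse table loaded by the ETL steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, level TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "free"), (2, "paid")])

row_count = check_table_has_rows(conn, "users")
check_no_nulls(conn, "users", "user_id")
```

Raising an exception on failure is a deliberate choice in this sketch: a task that raises is marked as failed by the scheduler, so a bad load is surfaced instead of passing silently.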

Primary Language: Python
