Udacity Data Engineering Projects

This repository contains my solutions to the projects of the Udacity Data Engineering Nanodegree course, taken in summer 2019.

Projects

| Project Folder | Description | Done |
| --- | --- | --- |
| Project 1a - PostgreSQL | Building a star schema in PostgreSQL and inserting data via Python | ✔️ |
| Project 1b - Cassandra | Building a star schema in Cassandra and inserting data via Python | ✔️ |
| Project 2 - AWS Redshift | Building a star schema in AWS Redshift and inserting data from AWS S3 via Python | ✔️ |
| Project 3 - Spark | Reading and transforming data from AWS S3 with Spark and writing it to partitioned Parquet files | ✔️ |
| Project 4 - Airflow Pipelines | Building an Airflow pipeline to automate parsing and transforming files from AWS S3 into AWS Redshift | ✔️ |
| Project 5 - Capstone Project | Integrating files from S3 into PostgreSQL via Spark | ✔️ |
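
For readers unfamiliar with the star-schema pattern that several of the projects above build, here is a minimal sketch of the idea: one fact table referencing dimension tables, filled from Python. It is not taken from the project folders. It assumes Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server, and the table names (`dim_users`, `dim_songs`, `fact_plays`), the function names, and the sample rows are all invented for illustration.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS dim_users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS dim_songs (
            song_id TEXT PRIMARY KEY,
            title   TEXT NOT NULL
        );
        -- The fact table stores one row per event and points at the dimensions.
        CREATE TABLE IF NOT EXISTS fact_plays (
            play_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id   INTEGER NOT NULL REFERENCES dim_users (user_id),
            song_id   TEXT NOT NULL REFERENCES dim_songs (song_id),
            played_at TEXT NOT NULL
        );
        """
    )


def load_sample_rows(conn):
    """Insert a few made-up rows using parameterized statements."""
    conn.executemany(
        "INSERT OR IGNORE INTO dim_users VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT OR IGNORE INTO dim_songs VALUES (?, ?)",
        [("S1", "First Song"), ("S2", "Second Song")],
    )
    conn.executemany(
        "INSERT INTO fact_plays (user_id, song_id, played_at) VALUES (?, ?, ?)",
        [
            (1, "S1", "2019-07-01T10:00:00"),
            (1, "S2", "2019-07-01T10:05:00"),
            (2, "S1", "2019-07-02T09:30:00"),
        ],
    )
    conn.commit()


def plays_per_user(conn):
    """Join the fact table to a dimension and count plays per user name."""
    return conn.execute(
        """
        SELECT u.name, COUNT(*)
        FROM fact_plays AS f
        JOIN dim_users AS u ON u.user_id = f.user_id
        GROUP BY u.name
        ORDER BY u.name
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    load_sample_rows(connection)
    print(plays_per_user(connection))
```

When porting a sketch like this to PostgreSQL, note that drivers such as psycopg2 use `%s` placeholders instead of `?`, and that `INSERT OR IGNORE` corresponds to `INSERT ... ON CONFLICT DO NOTHING`.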