Amazing Prime wants me to create an automated pipeline that takes in new data, performs the appropriate transformations, and loads the data into existing tables. I created a function that takes in three files (Wikipedia data, Kaggle metadata, and MovieLens rating data) and performs the ETL process before adding the data to a PostgreSQL database.
- Data ETL using Python/Jupyter Notebook
- PostgreSQL
- pgAdmin
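
A minimal sketch of what such a three-file ETL function might look like with pandas is shown below. It is an illustration under stated assumptions, not the project's actual code: the function name `extract_transform_load`, the `imdb_id` join column, the table names `movies` and `ratings`, and the specific cleaning steps are all hypothetical.

```python
import json

import pandas as pd


def extract_transform_load(wiki_file, kaggle_file, ratings_file, con):
    """Hypothetical ETL sketch: read three source files, clean and merge
    them, then append the results to SQL tables through `con`."""
    # Extract: Wikipedia data is assumed to be a JSON list of records;
    # Kaggle metadata and MovieLens ratings are assumed to be CSV files.
    with open(wiki_file, encoding="utf-8") as f:
        wiki_df = pd.DataFrame(json.load(f))
    kaggle_df = pd.read_csv(kaggle_file, low_memory=False)
    ratings_df = pd.read_csv(ratings_file)

    # Transform (assumed steps): drop rows missing the join key,
    # remove duplicate movies, and merge the two movie sources.
    wiki_df = wiki_df.dropna(subset=["imdb_id"]).drop_duplicates(subset="imdb_id")
    kaggle_df = kaggle_df.dropna(subset=["imdb_id"]).drop_duplicates(subset="imdb_id")
    movies_df = wiki_df.merge(kaggle_df, on="imdb_id", suffixes=("_wiki", "_kaggle"))

    # Load: append to the tables so rows already stored are kept.
    movies_df.to_sql("movies", con, if_exists="append", index=False)
    ratings_df.to_sql("ratings", con, if_exists="append", index=False)
    return movies_df, ratings_df


# Example call against PostgreSQL (placeholder credentials and database name):
# from sqlalchemy import create_engine
# engine = create_engine("postgresql://user:password@localhost:5432/movie_data")
# extract_transform_load("wiki.json", "kaggle.csv", "ratings.csv", engine)
```

Using `if_exists="append"` matches the goal of loading into existing tables, but running the function twice on the same files would insert the same rows twice, so a real pipeline would need some way to guard against reloads.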