I started this project for two main reasons: to showcase my PySpark and pandas skills, and to apply my programming to a subject I love, Formula 1. The project is still in development and currently involves:
- Developing scripts that create the tables needed for the pipeline using PySpark.
- Developing the same scripts with pandas.
- Developing the base data engineering infrastructure for the AWS account using Terraform.
Disclaimer: This project will be updated whenever I have time.