Premier League Stats

Aim

The primary objective of this project was to gather and integrate Premier League data from various sources in order to perform detailed analysis, predictions, and visualizations. An additional goal was to develop a dynamic, interactive dashboard that updates in real time as soon as the database is modified.

Architecture

[Image: architecture diagram]

Data is extracted from several football websites and stored in two locations: this repository and a cloud PostgreSQL database. Both are updated every weekend.
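Because the database is refreshed every weekend, the load step needs to be safe to re-run. The sketch below shows one way to do that with an upsert. It is an illustration under assumptions, not the project's actual loader: the `league_table` table and its columns are hypothetical, and the standard-library `sqlite3` module stands in for the cloud PostgreSQL database so the example runs without a server.

```python
import sqlite3

# Hypothetical table and columns, used only for illustration.
UPSERT = """
INSERT INTO league_table (team, played, points)
VALUES (?, ?, ?)
ON CONFLICT (team) DO UPDATE SET
    played = excluded.played,
    points = excluded.points
"""


def refresh(conn, rows):
    """Insert new teams and overwrite the stats of teams already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS league_table ("
        "team TEXT PRIMARY KEY, played INTEGER, points INTEGER)"
    )
    conn.executemany(UPSERT, rows)
    conn.commit()


# SQLite in memory is a stand-in for the cloud PostgreSQL database.
conn = sqlite3.connect(":memory:")

# First weekend: two teams are loaded.
refresh(conn, [("Team A", 10, 24), ("Team B", 10, 19)])

# Next weekend: Team A is updated in place, Team C is new, Team B is untouched.
refresh(conn, [("Team A", 11, 27), ("Team C", 11, 15)])

table = conn.execute(
    "SELECT team, played, points FROM league_table ORDER BY team"
).fetchall()
print(table)
```

PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` clause, so the statement should carry over with little change. The placeholder style depends on the driver (for example, psycopg uses `%s` rather than `?`).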

Details

The ./Updating_with_BS4/extract_transform.py script scrapes data from football websites that hold Premier League data. The site pages are scraped with Beautiful Soup and Selenium, transformed with PySpark and pandas, and the results are stored in the csv_dir folder.
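The extract, transform, and store flow described above can be sketched roughly as follows. This is a simplified illustration, not the contents of extract_transform.py: it uses only the Python standard library (`html.parser` and `csv`) in place of Beautiful Soup, Selenium, PySpark, and pandas so that it runs without extra dependencies. The sample HTML, the column names, and the `league_table.csv` file name are all hypothetical, and the output goes to a temporary `csv_dir` rather than the repository folder.

```python
import csv
import tempfile
from html.parser import HTMLParser
from pathlib import Path

# Hypothetical sample of a scraped league-table page (not real data).
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Played</th><th>Points</th></tr>
  <tr><td> Team A </td><td>10</td><td>24</td></tr>
  <tr><td>Team B</td><td>10</td><td>19</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract(html):
    """Extract: turn the HTML table into a list of dicts keyed by header."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def transform(records):
    """Transform: rename columns and convert numeric strings to integers."""
    return [
        {"team": r["Team"], "played": int(r["Played"]), "points": int(r["Points"])}
        for r in records
    ]


def load(records, csv_dir, name="league_table.csv"):
    """Store: write the cleaned records to a CSV file inside csv_dir."""
    csv_dir = Path(csv_dir)
    csv_dir.mkdir(parents=True, exist_ok=True)
    path = csv_dir / name
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["team", "played", "points"])
        writer.writeheader()
        writer.writerows(records)
    return path


rows = transform(extract(SAMPLE_HTML))
out_path = load(rows, Path(tempfile.mkdtemp()) / "csv_dir")
print(rows)
```

Keeping the three stages as separate functions means the parsing step could be swapped for a Beautiful Soup or Selenium implementation without touching the cleaning or CSV-writing code.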

Database Model

[Image: database model diagram]

Metabase Dashboard

[Image: Metabase dashboard screenshot]

Power BI Dashboard

[Image: Power BI dashboard screenshot]

Drawing on my experience in data integration, visualization, and dashboard development, I completed this project and used the resulting insights to inform decisions related to the Premier League. The dashboards provide a comprehensive, intuitive view of key performance indicators, making it quick to identify trends, patterns, and opportunities for improvement.