This repository explores the pivotal role of young players in the Real Madrid Football Club's first-team squad, with a focus on their impact in La Liga and in other competitions such as the Copa del Rey and the UEFA Champions League.
Data was sourced from Wyscout, a leading provider of second-order sports data.
- Eduardo Camavinga
- Aurelien Tchouameni
- Jude Bellingham
- Daniel Ceballos (included for his effective passing range and deep understanding of Madrid's play style)
The primary focus was on 'passes', a fundamental aspect of team play. Because the pass data is continuous, regression trees were used rather than classifiers. The analysis aims to quantify these players' relevance and regularity in the domestic league, as well as their impact on other competitions.
- Data Loading and Library Importation
- Data Preprocessing: Dropping and consolidating columns, managing missing data, converting percentage columns to floats.
- Position Relevance
- Average Playing Minutes
- Interceptions
- Exploratory Data Analysis: Employing Pearson and Spearman for heatmap analysis, creating dispersion matrices, and applying OneHotEncoding to categorical position columns.
- Feature Engineering:
  - Step 1 (Feature Establishment): Addition of new features based on the dispersion matrix.
  - Step 2 (Imputation Process): Applied to the processed data.
These steps homogenise the data so that it is better suited to machine learning models.
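The preprocessing and exploratory steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual pipeline: the sample rows, the column names (`minutes`, `passes`, `pass_accuracy`, `position`), and the `preprocess` helper are all hypothetical, median imputation is just one simple choice, and `pandas.get_dummies` stands in for whichever one-hot encoder the notebooks use.

```python
import pandas as pd

# Hypothetical sample rows; column names are illustrative, not the Wyscout schema.
raw = pd.DataFrame(
    {
        "player": ["A", "B", "C", "D"],
        "position": ["CM", "DM", "CM", "AM"],
        "minutes": [90, 75, None, 60],
        "passes": [62, 48, 55, 40],
        "pass_accuracy": ["91%", "88.5%", "93%", None],
    }
)


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: percentage conversion, imputation, one-hot encoding."""
    out = df.copy()
    # Convert percentage strings such as "91%" to floats in [0, 1].
    out["pass_accuracy"] = out["pass_accuracy"].str.rstrip("%").astype(float) / 100
    # Impute missing numeric values with the column median (an assumed strategy).
    numeric_cols = ["minutes", "pass_accuracy"]
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())
    # One-hot encode the categorical position column.
    return pd.get_dummies(out, columns=["position"], prefix="pos", dtype=float)


clean = preprocess(raw)

# Pearson and Spearman correlation matrices, the inputs to the heatmaps.
numeric = clean.drop(columns=["player"])
pearson = numeric.corr(method="pearson")
spearman = numeric.corr(method="spearman")
```

Either correlation matrix can then be passed to a plotting library to draw the heatmap.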
The repository includes decision tree visualisations and model validation curves to provide clear insights into the models' functioning and efficacy. The analysis hinges on machine learning models using Random Forests in both Scikit-Learn and XGBoost, chosen for their robust decision tree frameworks that can be finely tuned for optimised results.
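As a rough illustration of how a validation curve for a Random Forest regressor can be produced with Scikit-Learn, consider the sketch below. The data is synthetic and the choices of `max_depth` as the tuned parameter, `n_estimators=50`, five-fold cross-validation, and R² scoring are assumptions made for the example; the repository's own features, target, and hyperparameters may differ, and the XGBoost variant is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import validation_curve

# Synthetic stand-in data; the real features and target come from the processed dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 50 + 8 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2.0, size=200)

# Candidate tree depths to compare (assumed range, for illustration only).
depths = [1, 2, 4, 8]

# Cross-validated training and validation scores for each depth.
train_scores, valid_scores = validation_curve(
    RandomForestRegressor(n_estimators=50, random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=5,
    scoring="r2",
)

# Average over folds: one point per depth on each curve.
mean_train = train_scores.mean(axis=1)
mean_valid = valid_scores.mean(axis=1)
best_depth = depths[int(np.argmax(mean_valid))]
```

Plotting `mean_train` and `mean_valid` against `depths` gives the validation curve; a widening gap between the two lines suggests overfitting.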
This repository is intended for data scientists, machine learning engineers, and statisticians.
The content is presented in a scholarly and professional manner, befitting the audience's expertise and the analytical nature of the subject.
pip install --upgrade pip
python3 -m pip install virtualenv
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
deactivate  # run this only once you have finished working in the environment
The following steps are performed from the terminal:
1. git init
2. git remote add origin ["copy here ssh or https"]
3. git remote -v
4. git add -A
5. git add . (optional: step 4 has already staged all changes)
6. git commit -m "insert here your commit"
7. git status
8. git push origin master
If you have already created your repository, then:
1. git remote add origin ["copy here ssh or https"]
2. Follow the same staging and commit procedure as above.
3. Note: if the repository already has a ReadMe.md and License.md, run git pull origin master first. This is always recommended practice.
4. git push origin master