A project which predicts the market value of footballers

Primary language: Jupyter Notebook

Data-Analytics Project

Football player market value predictor and team recommender

Football is the most popular sport in the world, so it is no surprise that footballers earn a lot of money, and the players at the pinnacle of the game command huge market values.
Not every team has a big budget, so teams with smaller budgets have to manage their finances carefully while still fielding the best possible team.

The aim of this project is to:

  • Predict the market value of footballers from attributes such as age, team, overall rating, and market value in previous years.
  • Given a particular team and formation, recommend a replacement player who improves the team.
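
To make the second aim concrete, here is a minimal sketch of one way a replacement could be picked. It is not the method used in RecommendationSystem.ipynb: it is a plain filter on position, rating, and budget, and it does not model the formation beyond naming the position to fill. The `Player` fields, the position labels, and the `recommend_replacement` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    # Hypothetical record; field names are assumptions, not the repo's schema.
    name: str
    position: str   # e.g. "ST" or "CB"
    overall: int    # overall rating
    value_eur: int  # market value in euros


def recommend_replacement(squad, pool, position, budget_eur):
    """Return (player_to_replace, replacement) for `position`, or None.

    The weakest squad player in that position is the one replaced. The
    replacement must be rated higher and cost no more than `budget_eur`.
    """
    current = [p for p in squad if p.position == position]
    if not current:
        return None
    weakest = min(current, key=lambda p: p.overall)

    squad_names = {p.name for p in squad}
    candidates = [
        p for p in pool
        if p.position == position
        and p.name not in squad_names
        and p.overall > weakest.overall
        and p.value_eur <= budget_eur
    ]
    if not candidates:
        return None

    # Highest rating first; the cheaper player wins a tie.
    best = min(candidates, key=lambda p: (-p.overall, p.value_eur))
    return weakest, best
```

Calling `recommend_replacement(squad, pool, "ST", 6_000_000)` returns the weakest striker in `squad` together with the highest-rated striker in `pool` who costs at most 6 million euros, or `None` when no affordable upgrade exists.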

The data used is the FIFA dataset for the years 2015 to 2020, courtesy of Kaggle.
The repo has the following folders:
* Code: Python and R files that are run to obtain and analyze the results
* Papers: Papers referred to for the project
* fifa-20-complete-player-dataset: The dataset folder, containing all the data files used in the project
* Literature survey: The literature survey document

Code

  • PreprocessEDA.rmd: Code for preprocessing and exploratory data analysis (EDA)
  • RecommendationSystem.ipynb: Code for team recommendation
  • Neural_Network.py: Code for the neural network model for market value prediction
  • Lasso_regression.rmd: Code for lasso regression model for market value prediction
  • Ridge_regression.rmd: Code for ridge regression model for market value prediction
  • Linear_models.rmd: Code for the linear regression model for market value prediction
  • ts.rmd: Code for the time series analysis
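
The regression models listed above are written in R. As an illustration of what ridge regression does, the sketch below fits a single-feature ridge model in plain Python, for example predicting this year's market value from last year's. It is an assumption-laden toy, not the code in Ridge_regression.rmd: the function name `fit_ridge_1d` is hypothetical, there is only one feature, and the intercept is left unpenalized. Lasso regression differs in that it penalizes the absolute value of the slope rather than its square.

```python
def fit_ridge_1d(xs, ys, lam):
    """Fit y = intercept + slope * x, penalizing slope**2 by `lam`.

    Minimizes sum((y - intercept - slope * x)**2) + lam * slope**2.
    With centered data the solution is slope = Sxy / (Sxx + lam).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx + lam == 0:
        raise ValueError("xs are constant and lam is 0; slope is undefined")
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return intercept, slope
```

With `lam = 0` this reduces to ordinary least squares. Increasing `lam` shrinks the slope toward zero, which is the trade-off ridge regression makes to reduce variance.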

Papers

  • dataanalysis: Consulted for ideas on EDA
  • NN: Consulted for neural networks
  • souryadey: Consulted for neural networks
  • Recommendation: Consulted for collaborative filtering

fifa-20-complete-player-dataset

It has the following data:

  • players_15.csv: Data about players from 2015
  • players_16.csv: Data about players from 2016
  • players_17.csv: Data about players from 2017
  • players_18.csv: Data about players from 2018
  • players_19.csv: Data about players from 2019
  • players_20.csv: Data about players from 2020
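
Since one of the prediction inputs is a player's market value in earlier years, the season files need to be joined on a player identifier. The sketch below shows one possible way to do that with the standard library. The column names `sofifa_id` and `value_eur` are assumptions about the CSV headers, and the helper names are hypothetical; adjust them to match the actual files.

```python
import csv
from pathlib import Path

SEASONS = range(15, 21)  # players_15.csv through players_20.csv


def season_path(data_dir, season):
    """Build the path to one season file inside the dataset folder."""
    return Path(data_dir) / f"players_{season}.csv"


def load_values(path, id_col="sofifa_id", value_col="value_eur"):
    """Map player id -> market value for one season file.

    Column names are assumed; rows with an empty value are skipped.
    """
    values = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row[value_col] == "":
                continue
            values[row[id_col]] = float(row[value_col])
    return values


def lagged_pairs(previous, current):
    """Pair each player's previous-season value with the current one.

    Players missing from the previous season are dropped.
    """
    return [(previous[pid], current[pid]) for pid in current if pid in previous]
```

For example, `lagged_pairs(load_values(season_path("fifa-20-complete-player-dataset", 19)), load_values(season_path("fifa-20-complete-player-dataset", 20)))` would give (2019 value, 2020 value) pairs for every player present in both files.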