/Football-Predictions

Predictions of football matches for EPL


FOOTBALL DATA ANALYSIS
 
Introduction: 
 
Football, or soccer for our American friends, is more than just a game. It is played by 250 million players in over 200 countries, making it the world's most popular sport. Each of these countries has its own domestic league, in which teams compete to be named the best football team in that country.
 
Being the staunch football fan I am, I decided to investigate the world's most popular domestic league (the English Premier League, or EPL) for factors that can influence the outcome of a match.
 
This was my first proper data analysis project, and I'm not necessarily aiming to use the results to build a model or a prediction system. The project simply aims to satisfy my obsession with football and to let me get my hands dirty working as a data analyst.
 
 
Data Set: 
 
The data set was obtained from http://www.football-data.co.uk/data.php. The data files are attached as separate files in the repository, along with a file describing the columns. Some columns related to betting statistics are missing from the data, but the playing statistics are present.
 
For this particular project, I downloaded data for 15 EPL seasons, from 2000-01 to 2014-15.
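
As a rough illustration of how the 15 season files could be read into one list of matches, here is a minimal sketch using only the Python standard library. The file naming scheme (one CSV per season, named like 2000-01.csv) and the helper names season_labels and load_seasons are assumptions made for this example, not something taken from the repository.

```python
import csv
from pathlib import Path


def season_labels(first_start_year=2000, n_seasons=15):
    """Return labels such as '2000-01' for consecutive seasons."""
    return [
        f"{year}-{(year + 1) % 100:02d}"
        for year in range(first_start_year, first_start_year + n_seasons)
    ]


def load_seasons(data_dir, labels):
    """Read one CSV per season and tag every match row with its season.

    Assumes hypothetical file names like '<data_dir>/2000-01.csv'.
    Seasons whose file is missing are skipped.
    """
    matches = []
    for label in labels:
        path = Path(data_dir) / f"{label}.csv"
        if not path.exists():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["Season"] = label  # remember which season the match belongs to
                matches.append(row)
    return matches
```

Calling load_seasons("data", season_labels()) would then return every match from 2000-01 to 2014-15 as a list of dictionaries, provided the files follow the assumed naming and encoding.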
 
Q) How do the results of past matches predict the outcome of the next game?
 
The results of the past 5 or past 3 games have traditionally been considered a good enough metric of how a team will perform in its current game. More generally, a team's results over its past n games give an idea of how it is going to perform. For example, if a team has lost its last 5 matches, it is very likely to either lose or draw its next match.
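
One way to turn "results of the past n games" into a number is to sum the league points (3 for a win, 1 for a draw, 0 for a loss) each team earned in its previous n matches. The sketch below is only an illustration of that idea, not the notebook's actual method. It assumes each match is a dictionary with the keys HomeTeam, AwayTeam and FTR (full-time result: H, D or A), and that matches are already sorted by date. The function name add_recent_form and the output keys HomeForm and AwayForm are made up for this example.

```python
from collections import defaultdict, deque


def add_recent_form(matches, n=5):
    """Attach the points each side earned in its previous n games.

    `matches` must be in chronological order. Each item is a dict with the
    assumed keys 'HomeTeam', 'AwayTeam' and 'FTR' ('H', 'D' or 'A').
    """
    # team -> points from its most recent n games (older games drop off)
    history = defaultdict(lambda: deque(maxlen=n))
    points = {"H": (3, 0), "D": (1, 1), "A": (0, 3)}  # (home, away) points
    enriched = []
    for match in matches:
        home, away = match["HomeTeam"], match["AwayTeam"]
        # Form is read before the current result is stored, so a match
        # never "sees" its own outcome.
        enriched.append({
            **match,
            "HomeForm": sum(history[home]),
            "AwayForm": sum(history[away]),
        })
        home_pts, away_pts = points[match["FTR"]]
        history[home].append(home_pts)
        history[away].append(away_pts)
    return enriched
```

Two caveats with this sketch: a team's first n matches are scored on fewer than n games, and form carries over from one season to the next unless the function is run on each season separately.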