Desing by Frank Laércio, Software Developer at CI&T.
frank.junior@ciandt.com github.com/franklaercio
Content
- Importing libraries and files for GCP
- Fifa World Cup Data Analyzing
- Historical Results
- Ranking Fifa
- Modeling Machine Learning for Fifa World Cup Predictions
- Qatar World Cup Predictions
In this section, we understand our dataset and make some assumptions. Like a does the FIFA ranking influence the results, are the clashes in world cups important in today's results or the win set is an interesting data.
This section is an important part of building the algorithm, as we will check the previous results of the selections in matches. But we will make the following refinements to the data:
- Check match dates.
- Select the most played tournaments, as they are the most disputed.
- Remove friendlies, as they can negatively influence predictions, due to their competitiveness.
- Remove matches older than the 21st century, as soccer is very current.
In this section we will check the Fifa ranking for each team. This step will be important for determining the algorithm, as the FIFA ranking makes a calculation to rank the strongest teams.
In this section we will make the predictions after organizing the data previously done. For this, we will use the decision tree technique and a regression forest. These are algorithms provided by the sklearn library.
Para fazer o aprendizado iremos utilizar os dados da diferença de ranks e a média do rank. In addition, to verify the effectiveness of the algorithm, we will use the data of who won or lost the match.