Predicting Cristiano Ronaldo’s “attempts on the goal target”

Primary language: Jupyter Notebook

zs-data-science-challenge-2019

NOTE

This is by no means production-ready code; I only use Jupyter notebooks for prototyping. The submissions were made in a two-day "competitive environment".

Problem Statement

As we know, Cristiano Ronaldo is a legend in the football world. He has played a thousand games and scored hundreds of goals. Now, given the dataset of Cristiano Ronaldo’s “attempts on the goal target” in all his recorded and unrecorded matches, predict whether he scored a goal or not. Formally, you are given a dataset of attempts taken by Ronaldo; predict whether he scored a goal or not.

Approach

I used Random Forests with sub-sampling (100 randomly chosen rows) and ensembled all the trees, which gave me around 72% IMAE and an r^2 score of 0.62.
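The sub-sampling idea above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's `RandomForestRegressor` with `max_samples=100`, and it uses synthetic stand-in data because the competition dataset and its column names are not shown here. The tree count and `min_samples_leaf` are arbitrary choices for the example. It reports plain MAE and r^2 only, since the definition of the IMAE metric is not given in this README.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): 5 numeric features, binary "goal" target.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
noise = rng.normal(scale=0.5, size=2000)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(float)

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each tree is fit on 100 rows drawn at random from the training set;
# the forest prediction is the average over all trees.
model = RandomForestRegressor(
    n_estimators=100,      # assumed value, not taken from the notebook
    max_samples=100,       # sub-sample size per tree
    bootstrap=True,        # required for max_samples to take effect
    min_samples_leaf=3,    # assumed value, not taken from the notebook
    random_state=0,
)
model.fit(X_train, y_train)

pred = model.predict(X_valid)
mae = mean_absolute_error(y_valid, pred)
r2 = r2_score(y_valid, pred)
print(f"MAE: {mae:.3f}  r^2: {r2:.3f}")
```

Small per-tree samples make each tree fast to train and less correlated with the others, which is convenient when iterating quickly under a two-day deadline.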

Feature Importance

(Figure: feature importance plot)
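One way a feature ranking like this could be produced is from the forest's impurity-based importances. The sketch below is an assumption about the general technique, not a reproduction of the notebook: the feature names `f0` to `f3` and the data are hypothetical, and the original plot may have been generated with a different helper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data: only the first feature carries signal about the target.
rng = np.random.default_rng(0)
feature_names = ["f0", "f1", "f2", "f3"]  # placeholder names
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + rng.normal(scale=0.3, size=1000) > 0).astype(float)

forest = RandomForestRegressor(
    n_estimators=100, max_samples=100, random_state=0
).fit(X, y)

# feature_importances_ holds one impurity-based score per column.
ranking = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based scores can overstate high-cardinality features, so they are best read as a rough guide for deciding which columns to inspect or drop.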

Exposure

Python, NumPy, pandas, scikit-learn, fastai