Data-Analysis-Tennis

This project analyzes tennis stats and features to determine which ones are relevant for predicting a match winner.

Primary language: Jupyter Notebook. License: MIT.

Analyzing tennis data for future predictions

For those who don't know, I spent more than a year working on a predictive system able to make money consistently by betting on NBA games. I started this journey in September 2019 and, by December 2020, I had found an approach that was way better than what I had hoped to find.

That's when Netty was born and, an entire regular season later, I can proudly say that it yielded a 9.14% ROI betting mostly on underdogs (average odds were 2.15) and made more than 51 units in profit in just 280 games. For example, someone with a unit of 100€ would have made over 5100€ in profit in less than 5 months. Being the ambitious person that I am, I wasn't going to settle for something that was profitable for only 6 months of the year. I want to earn money consistently and regularly, which is why I need another model that can work the entire year.
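As a rough sanity check on the figures above, here is a minimal sketch of the arithmetic. It assumes that ROI means profit divided by total amount staked and that odds are quoted in decimal format; the function names are hypothetical and are not taken from Netty. Since the profit is stated as "more than 51 units", the results are approximations, and the break-even rate at the average odds is only indicative because individual bets had different odds.

```python
def profit_in_currency(units_won: float, unit_size: float) -> float:
    """Convert a profit expressed in betting units into currency."""
    return units_won * unit_size


def implied_total_staked(units_won: float, roi: float) -> float:
    """Total units staked implied by a profit and an ROI.

    Assumes ROI = profit / total staked (an assumption, not stated in the text).
    """
    return units_won / roi


def break_even_win_rate(decimal_odds: float) -> float:
    """Win rate needed to break even at the given decimal odds."""
    return 1.0 / decimal_odds


# Figures quoted in the text: 51 units of profit, 9.14% ROI, 280 games,
# average odds of 2.15, and an example unit size of 100 EUR.
profit_eur = profit_in_currency(51, 100)
staked_units = implied_total_staked(51, 0.0914)
avg_stake_per_game = staked_units / 280
win_rate_needed = break_even_win_rate(2.15)

print(f"Profit with a 100 EUR unit: {profit_eur:.0f} EUR")
print(f"Implied total staked: {staked_units:.0f} units")
print(f"Implied average stake per game: {avg_stake_per_game:.2f} units")
print(f"Break-even win rate at odds 2.15: {win_rate_needed:.1%}")
```

Under these assumptions, the quoted ROI and profit imply an average stake of roughly two units per game, and bets at odds of 2.15 only need to win a bit less than half of the time to break even.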

There were different options: tennis, horse racing, greyhound racing... I ended up choosing tennis because I'm not particularly a fan of animal racing and tennis is a much more popular sport (which usually means more data).

This project is basically about all the tinkering and analysis that needs to be done before actually creating the model. What they say is true: in machine learning, 80% of the time is spent on the data and 20% on finding the desired model. From data preparation to data processing to data analysis and visualization, you'll find an organized and structured project that gives an idea of which stats are relevant when trying to predict a tennis match winner.