In this project, we attempt to build a model that predicts the odds of success of a bet at any point in a tennis match. The best published tennis match prediction models currently report accuracies of around 64%, and we aim to match or exceed that figure. A dataset containing the in-game statistics of over 1400 matches from the US Open, Australian Open, French Open and Wimbledon has been downloaded from the UC Irvine Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Tennis+Major+Tournament+Match+Statistics) and processed for the study. The performance of different classification techniques, such as Support Vector Classifiers, Logistic Regression and Decision Tree classifiers, will be evaluated and compared.
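The comparison described above could be set up roughly as follows. This is a minimal sketch, not the project's actual pipeline: it assumes scikit-learn is available, uses a synthetic 12-feature dataset from make_classification as a stand-in for the processed match statistics, and scores each classifier with 5-fold cross-validated accuracy. The helper name compare_models, the fold count and the model settings are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data (12 numeric features, binary target),
# NOT the real tennis statistics.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, n_redundant=2, random_state=0
)

# The three classifier families named in the text. SVC and logistic
# regression are wrapped in a scaling pipeline because both are
# sensitive to feature scale; the decision tree is not.
models = {
    "svc": make_pipeline(StandardScaler(), SVC()),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}


def compare_models(models, X, y, folds=5):
    """Return the mean cross-validated accuracy for each model (hypothetical helper)."""
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = compare_models(models, X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Cross-validation is used here rather than a single train/test split so that the accuracy estimates are less dependent on one particular partition of a fairly small dataset.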
DESCRIPTION OF DATA FEATURES
FSP First Serve Percentage for player
FSW First Serve Won by player
SSP Second Serve Percentage for player
SSW Second Serve Won by player
ACE Aces won by player
DBF Double Faults committed by player
WNR Winners earned by player
UFE Unforced Errors committed by player
BPC Break Points Created by player
BPW Break Points Won by player
NPA Net Points Attempted by player
NPW Net Points Won by player
Result Outcome of the match (0 for loss, 1 for win); this is the target variable
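To make the split between input features and target concrete, the sketch below separates the twelve statistics from the Result column using only the Python standard library. It rests on several assumptions: the column names are taken to be exactly the abbreviations listed above (the downloaded files may use different names, for example per-player suffixes), the two sample rows are made-up values for illustration rather than real matches, and split_features_target is a hypothetical helper.

```python
import csv
import io

# Feature abbreviations as listed above; "Result" is the target column.
# Assumption: the processed file uses exactly these column names.
FEATURES = ["FSP", "FSW", "SSP", "SSW", "ACE", "DBF",
            "WNR", "UFE", "BPC", "BPW", "NPA", "NPW"]
TARGET = "Result"

# Made-up rows for illustration only (not taken from the dataset).
SAMPLE_CSV = """FSP,FSW,SSP,SSW,ACE,DBF,WNR,UFE,BPC,BPW,NPA,NPW,Result
61,35,39,18,5,1,17,29,8,1,11,8,0
68,45,32,17,10,0,40,30,8,7,12,8,1
"""


def split_features_target(rows):
    """Split dict rows into a feature matrix X and a label vector y."""
    X, y = [], []
    for row in rows:
        X.append([float(row[name]) for name in FEATURES])
        y.append(int(row[TARGET]))
    return X, y


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
X, y = split_features_target(rows)
print(len(X), "rows,", len(X[0]), "features, labels:", y)
```

With a real file, io.StringIO(SAMPLE_CSV) would be replaced by an opened file handle. Rows with missing values would need handling before the float conversion.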