deuce
is an R package that provides easy access to a rich set of online data on professional tennis. By making tennis data more available to R users, deuce
aims to be a useful tool for tennis analysts and a fun resource for teachers of statistics.
To install in R, use the devtools
package and the following:
library(devtools)
install_github("skoval/deuce")
There are 274 MB of data included with the package so the installation may take several minutes.
To find out about the datasets and functions included in deuce
, you can use the following command to bring up the package index.
help(package = "deuce")
Any of the individual datasets can be loaded with the data
command. For example, the following command brings the atp_matches
data into the R environment and runs a summary on all of the columns.
data(atp_matches)
summary(atp_matches)
The make.R
file under the package parent directory does all of the pre-processing for the major historical datasets. All of the data sources can be accessed over an internet connection. The only change the user would have to make in order to update their local package would be to change the package_root
to the path where their local version of deuce
lives.
There are some analytic functions and some functions for fetching additional tennis data from the Web. One example of an analytic function is the elo_prediction
which computes the win chances for a player against a specific opponent given both player Elo ratings. Suppose, the player has an Elo rating of 2100. What is their implied win chance versus a player with a rating of 1950? We can compute that as follows:
elo_prediction(2100, 1950)
An example of one of the data-scraping functions is fetch_activity
. When connected to the Internet, this can be used to retrieve the match results for an ATP player for a specific year of for their career. As an example, let's show how we would fetch the 2017 match results for Rafael Nadal.
fetch_activity("Rafael Nadal", 2017)
There are also serveral tidy
functions for pre-processing the major datasets that are included with the package.
For users interested in updating or running their own player Elo ratings, I would recommend looking at the Rcpp implementation of @martiningram, which you can find here