De-constructing the run game to an excessive extent

Primary language: R. License: GNU General Public License v2.0 (GPL-2.0).

GroundControl

Welcome to the "Ground Control" project.

My goal is pretty simple. I want to investigate all aspects of the NFL running game to an extent that is unreasonably excessive.

You will not need to know one iota of statistics in order to follow the website. I will explain everything you need to know as it comes up, in plain English.

The Plan

I have been poking around NFL rushing data since early 2016, and it was time to write it up for other folks to enjoy too. Each chapter will introduce at least one major new idea, and will generally be pretty long. Most chapters will also have an "app" or two that will allow everybody to explore that topic themselves (and will hopefully make it easy to discover cool things for the rest of us to check out). All apps are programmed in Shiny - they're snippets of packaged visualization, analysis, and simulation code wrapped up in a pretty UI, run through the R statistical program, and distributed for free (everything is open-source). And finally, between chapters, I will occasionally post "quick hits" of cool stats or stories that are related to the chapter topic.

A chapter summary of the main points can be found at the end of each of the major posts.

The table of contents (HERE) will be updated with links to the new chapters.

All figures can be batch-downloaded from the GitHub page for the website.

The Data

Unless otherwise specified, I will be working with every regular-season rushing attempt by a running back from the six years between 2010 and 2015. Every other position has been removed (for now). In all, I have a database of about 71,000 individual rushing attempts.

The data is drawn from the official NFL JSON feed, through the wonderful nfldb Python package. This should reflect the official scorekeeping of the NFL.

This gives me access to play-by-play data on a whole host of features, from field position to down and distance to stadium to clock time. What I do NOT currently have access to is anything relating to play charting: formations, path taken, location of initial contact, broken tackles, etc. Drop me a line if you can hook me up.

The Commitment to Open Data

I will be using GitHub to publish the scripts, files, and data used for this project. There are two "branches" to this project: one for the website (yes, the website is also open access - you can see the source code that generated every page, or even push recommended typo fixes and clarifications), and one for the data and analysis scripts.

The entirety of the data is already available on the master branch of the Ground Control GitHub repository (the "rushing_data_stack.csv" file in the main folder). Both the data and the apps have recently been updated for the 2016 season.
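
If you would like to poke at the data yourself, here is a minimal sketch of reading that file into R. The raw-file URL is an assumption pieced together from the repository name, the master branch, and the file name mentioned above, and `load_rushing_data` is a made-up helper name, so adjust either if they do not match what you find on GitHub.

```r
# Sketch: load the published rushing data into an R data frame.
# ASSUMPTION: this raw-file URL is inferred from the repository name
# ("Forever-Peace/GroundControl"), the "master" branch, and the file name.
# It is not quoted from the project itself.
data_url <- paste0(
  "https://raw.githubusercontent.com/Forever-Peace/GroundControl/",
  "master/rushing_data_stack.csv"
)

# Hypothetical helper: read a CSV from a URL or from a local file path.
load_rushing_data <- function(path = data_url) {
  read.csv(path, stringsAsFactors = FALSE)
}

# Usage (needs an internet connection, so it is left commented out):
# rushing <- load_rushing_data()
# str(rushing)   # list the columns and their types
# nrow(rushing)  # number of rows in the file
```

If you have already downloaded the file, pass its location instead, e.g. `load_rushing_data("rushing_data_stack.csv")`.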

Every time I post a chapter, I will also publish the R script I used to generate the major findings. This GitHub repository contains those scripts. Direct links to chapter scripts will be kept in a rolling list in the table of contents for the website (HERE).

Further, all of the interactive apps will also be available, for free, through GitHub. The source code will be available under the "Chapters/shinyapps" folder, and the apps themselves can be downloaded and run automatically with just two lines of code (see below).

Using the Interactive Apps

Because website hosting logistics can be difficult and expensive, all apps will be distributed through GitHub. Using them is extremely easy, even if you have minimal computer knowledge. More detailed step-by-step instructions are here. But briefly, here are the prerequisites:
1) Download and install R from this link. R is free statistical analysis software.
2) Download and install RStudio from this link. RStudio is a useful interface for R, but more importantly, it enables built-in automatic support for the plugin the apps are built with, called “shiny”.
3) Update the packages you need. Open RStudio, and in the console, enter:

install.packages("ggplot2")
install.packages("shiny")
install.packages("reshape2")
install.packages("FNN")
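
If you already have some of these packages, a small check-then-install sketch like the one below avoids reinstalling them. Note that it skips, rather than updates, anything already installed, so stick with the four commands above if you want fresh copies. `missing_packages` is a hypothetical helper name used only for this illustration.

```r
# Packages listed in the setup steps above.
needed <- c("ggplot2", "shiny", "reshape2", "FNN")

# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Install only what is missing. The interactive() guard means this runs when
# pasted into a console such as RStudio, but not from a batch script.
to_install <- missing_packages(needed)
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```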

That’s it. After you do these three things, whenever I publish an app, I will give you two lines of code. To run the app, you'll just need to start RStudio, then copy-paste that code into the console and hit enter. The app will download and run automatically, from within the RStudio program.

Here is an example from the chapter 2 "player distribution" app:

library("shiny")
runGitHub("Forever-Peace/GroundControl", subdir = "Chapters/shinyapps/rb_dist/")

The first line activates the "shiny" plugin that runs the app. The second line downloads the app and runs it through RStudio. If you'd like to make sure there is no funny business, you can go to the GitHub page to see exactly the code that is run for the app (in this case, here).