Bringing data journalism to the sports section

Materials for this NICAR 2023 session.

Matt Waite, University of Nebraska
Derek Willis, University of Maryland
Rina Torchinsky, University of Maryland
MaryJo Webster, Star Tribune (in absentia)

Resources

Examples

Behind the Minnesota Vikings' Wild Season

We wanted a story that would especially appeal to our digital readers, one that leaned heavily on data and graphics to look back at the Vikings' crazy season just as they were heading into a playoff game. The biggest piece of this story was the set of play-by-play win probability charts, which are built from data that you can pull using the espnscrapeR package. (My video tutorials show how to do that for one or multiple games.)
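As a rough illustration of the charting step, here is a minimal sketch of drawing a win probability line from a data frame of play-level probabilities. The helper name `plot_win_prob` is hypothetical, and the default column name `home_win_percentage` is an assumption about what the espnscrapeR data looks like; check `names()` on your own data and pass the right column if it differs.

```r
# Hypothetical helper: draw one game's win probability as a line chart.
# Assumes `wp` has one row per play, in game order, and that `prob_col`
# names the column holding the home team's win probability.
plot_win_prob <- function(wp, prob_col = "home_win_percentage") {
  wp <- as.data.frame(wp)
  stopifnot(prob_col %in% names(wp))
  prob <- wp[[prob_col]]
  plot(seq_along(prob), prob, type = "l",
       xlab = "Play number", ylab = "Home win probability")
  invisible(prob)
}

# Usage (needs espnscrapeR and a network connection; fill in a real game ID):
# wp <- espnscrapeR::get_espn_win_prob(game_id = "<ESPN game id>")
# plot_win_prob(wp)
```

Base graphics keep the sketch dependency-free; the same data frame could just as easily feed ggplot2 or a newsroom charting tool.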

I also used play-by-play data downloaded with the espnscrapeR package to look at the point differential at the end of each quarter. The play-by-play data is super useful because it not only shows every play but also has a record for the end of each quarter and for the two-minute warnings, and you can find the start of overtime.
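The quarter-by-quarter point differential could be computed along these lines. This is a sketch under assumptions: the helper `quarter_end_diff` is hypothetical, the column names `quarter`, `home_score` and `away_score` are guesses to be matched against the actual play-by-play columns, and instead of relying on the end-of-quarter records it simply takes the last row seen in each quarter.

```r
# Hypothetical helper: score differential (home minus away) at the end of
# each quarter. Assumes rows are in game order and that the three column
# arguments name the quarter and running-score columns in `pbp`.
quarter_end_diff <- function(pbp,
                             quarter_col = "quarter",
                             home_col = "home_score",
                             away_col = "away_score") {
  pbp <- as.data.frame(pbp)
  stopifnot(all(c(quarter_col, home_col, away_col) %in% names(pbp)))
  # Keep only the last row recorded for each quarter
  is_last <- !duplicated(pbp[[quarter_col]], fromLast = TRUE)
  ends <- pbp[is_last, ]
  data.frame(
    quarter = ends[[quarter_col]],
    point_diff = ends[[home_col]] - ends[[away_col]]
  )
}

# Usage (needs espnscrapeR and a network connection; fill in a real game ID):
# pbp <- espnscrapeR::get_nfl_pbp(game_id = "<ESPN game id>")
# quarter_end_diff(pbp)
```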

In November, we published this data-heavy piece on the Vikings, which also leaned heavily on the win probability data. We used some NFL Next Gen Stats to look at average separation for Justin Jefferson. You can get that data with this little snippet, which uses the nflreadr package:

library(nflreadr)

# seasons = TRUE loads every available season of receiving Next Gen Stats
receiving_next_gen <- load_nextgen_stats(
  seasons = TRUE,
  stat_type = "receiving",
  file_type = getOption("nflreadr.prefer", default = "rds")
)
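Once the receiving data is loaded, pulling one player's separation numbers is a simple filter. The sketch below assumes the data has `player_display_name`, `season`, `week` and `avg_separation` columns and that rows with `week == 0` hold season-level figures; the helper name `receiver_separation` is hypothetical.

```r
# Hypothetical helper: season-level average separation for one receiver.
# Assumes week 0 rows carry the season totals in the Next Gen Stats data.
receiver_separation <- function(ngs, player = "Justin Jefferson") {
  ngs <- as.data.frame(ngs)
  keep <- which(ngs$player_display_name == player & ngs$week == 0)
  rows <- ngs[keep, c("season", "player_display_name", "avg_separation")]
  rows[order(rows$season), ]
}

# Usage with the data frame loaded by the snippet above:
# receiver_separation(receiving_next_gen)
```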

We also got passing stats on Kirk Cousins that show how often he was throwing into tight coverage (known as "aggressiveness"). You can get that data with this:

library(nflreadr)

# seasons = TRUE loads every available season of passing Next Gen Stats
passing_next_gen <- load_nextgen_stats(
  seasons = TRUE,
  stat_type = "passing",
  file_type = getOption("nflreadr.prefer", default = "rds")
)
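To follow a quarterback's aggressiveness week by week, the passing data can be narrowed to one player and one season. This is a sketch that assumes `player_display_name`, `season`, `week` and `aggressiveness` columns, treats `week > 0` as the weekly rows, and uses 2022 as an assumed default season; `weekly_aggressiveness` is a hypothetical helper name.

```r
# Hypothetical helper: weekly aggressiveness for one passer in one season.
# Rows with week == 0 are skipped on the assumption that they are season totals.
weekly_aggressiveness <- function(ngs, player = "Kirk Cousins", season = 2022) {
  ngs <- as.data.frame(ngs)
  keep <- which(
    ngs$player_display_name == player &
      ngs$season == season &
      ngs$week > 0
  )
  rows <- ngs[keep, c("week", "aggressiveness")]
  rows[order(rows$week), ]
}

# Usage with the data frame loaded by the snippet above:
# weekly_aggressiveness(passing_next_gen)
```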

Justin Jefferson piece, February 2023:
Most of this is just high-level stuff taken from various websites, but there are two pieces where we pulled data from an API using R. The main one is the bubble chart that shows how Jefferson led the team in offensive yards. (There is a nifty package called "packcircles" that works with the plotly package to make a bubble chart. The one in our story was made with other technology, though.) I also pulled the receiving yards per game data from the play-by-play data available through the espnscrapeR package.
What makes a fair trade for a first overall pick? Analysis of NHL Entry Draft Pick Values in Trade
There's Only One Caitlin Clark. Who Else Has "Clark-like" Stats?
Fewer Players Feast on Free Throws
Recreation Deserts in Maryland
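For the packed bubble chart idea mentioned in the Justin Jefferson piece, a minimal sketch with the packcircles package might look like this. The yardage values are made-up placeholders, not real statistics, and the drawing uses base graphics rather than plotly to keep the example short.

```r
library(packcircles)

# Placeholder values only: made-up yardage totals, not real statistics.
yards <- c(player_a = 1800, player_b = 900, player_c = 600, player_d = 300)

# Compute one circle per player, with circle area proportional to yards.
layout <- circleProgressiveLayout(yards, sizetype = "area")

# Draw the circles; inches = FALSE keeps the radii in data units and
# asp = 1 stops the circles from being stretched into ovals.
symbols(layout$x, layout$y, circles = layout$radius,
        inches = FALSE, asp = 1, xlab = "", ylab = "")
text(layout$x, layout$y, labels = names(yards))
```

The `layout` data frame (columns `x`, `y`, `radius`) is the part that carries over to other charting tools, since it is just circle centers and sizes.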

AhanPenkar/nicar23-sports
