Bringing data journalism to the sports section

Materials for this NICAR 2023 session.

  • Matt Waite, University of Nebraska
  • Derek Willis, University of Maryland
  • Rina Torchinsky, University of Maryland
  • MaryJo Webster, Star Tribune (in absentia)

Resources

Examples

We wanted a story that would especially appeal to our digital readers, one that leaned heavily on data and graphics to look back at the Vikings' crazy season just as they were heading into a playoff game. The biggest piece of this story was the set of play-by-play win probability charts, built from data that you can pull using the espnscrapeR package. (My video tutorials show how to do that for one or multiple games.)

I also used play-by-play data downloaded with the espnscrapeR package to look at the point differential at the end of each quarter. The play-by-play data is super useful because it not only shows every play but also has a record for the end of each quarter and for the two-minute warnings, and you can find the start of overtime in it too.
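
As a rough sketch of the end-of-quarter idea, the snippet below takes the last row recorded in each quarter and subtracts the two scores. It runs on a tiny made-up data frame rather than a real download, and the column names (`quarter`, `play_type`, `home_score`, `away_score`) are hypothetical stand-ins, so check them against what espnscrapeR actually returns before adapting it.

```r
# Toy play-by-play rows in game order. The column names are hypothetical
# stand-ins, not the confirmed espnscrapeR schema, and the scores are made up.
pbp <- data.frame(
  quarter    = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
  play_type  = c("Kickoff", "Pass", "End Period", "Rush", "End of Half",
                 "Pass", "End Period", "Rush", "End of Game"),
  home_score = c(0, 7, 7, 7, 10, 10, 10, 17, 17),
  away_score = c(0, 0, 0, 7, 7, 14, 14, 14, 14)
)

# Keep the last row of each quarter. This assumes the rows are sorted in
# game order, so the last row carries the end-of-quarter score.
last_rows <- pbp[!duplicated(pbp$quarter, fromLast = TRUE), ]

# Point differential from the home team's perspective.
quarter_diff <- data.frame(
  quarter         = last_rows$quarter,
  home_minus_away = last_rows$home_score - last_rows$away_score
)

print(quarter_diff)
```

With real data you could filter on the end-of-quarter records directly instead of relying on row order, once you have confirmed how those rows are labeled.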

  • In November, we published this data-heavy piece on the Vikings. It also leaned heavily on the win probability data. We used NFL Next Gen Stats as well, to look at average separation for Justin Jefferson. You can get that data with this little snippet:
library(nflreadr)

# seasons = TRUE requests every available season
receiving_next_gen <- load_nextgen_stats(
  seasons = TRUE,
  stat_type = "receiving",
  file_type = getOption("nflreadr.prefer", default = "rds")
)

And then we also got passing stats on Kirk Cousins that show how often he was throwing into tight coverage (known as "aggressiveness"). You can get that data with this:

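# Same call as for receiving, with stat_type switched to "passing"
# (needs library(nflreadr) loaded)
passing_next_gen <- load_nextgen_stats(
  seasons = TRUE,
  stat_type = "passing",
  file_type = getOption("nflreadr.prefer", default = "rds")
)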
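
Both calls return every player, so the next step is narrowing down to one name. Here is a minimal sketch of that filter, run on made-up toy rows so it works without a download. `player_ngs` is a hypothetical helper, not part of nflreadr, and the column names `player_display_name`, `season`, `week` and `avg_separation` are assumptions to verify with `names(receiving_next_gen)`.

```r
# Hypothetical helper (not an nflreadr function): keep one player's rows
# for one season, sorted by week. The column names are assumptions.
player_ngs <- function(ngs, player, season) {
  keep <- ngs$player_display_name == player & ngs$season == season
  out <- ngs[keep, , drop = FALSE]
  out[order(out$week), , drop = FALSE]
}

# Toy rows with made-up values, standing in for the real table.
toy <- data.frame(
  player_display_name = c("Justin Jefferson", "Justin Jefferson", "Other Player"),
  season              = c(2022, 2022, 2022),
  week                = c(2, 1, 1),
  avg_separation      = c(2.9, 3.1, 2.5)
)

jefferson <- player_ngs(toy, "Justin Jefferson", 2022)
print(jefferson[, c("week", "avg_separation")])
```

If the assumed columns hold, something like `player_ngs(receiving_next_gen, "Justin Jefferson", 2022)` would do the same on the real data, and the same helper could be pointed at `passing_next_gen` for Kirk Cousins.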