jeopardy_clue_dataset

A dataset containing more than 402,000 Jeopardy! clues (1984–2022).



This dataset contains Jeopardy! clues from Season 1 through Season 38 (July 2022). It does not contain every clue that has appeared on the show. The data source prefers not to be credited.

There are 402,416 clues in total. They are in combined_season1-38.tsv, which is distributed as a zip archive. Uncompressed, the file is approximately 59 MB.
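At roughly 59 MB, the combined file may be too large to open comfortably in a spreadsheet, so reading it from a script can be easier. The following is a minimal sketch using only the Python standard library. The function name `read_clues_from_zip` is hypothetical, and the UTF-8 encoding, the header row, and the choice to treat quotation marks as literal text are assumptions rather than documented properties of the file.

```python
import csv
import io
import zipfile


def read_clues_from_zip(zip_source, member_name="combined_season1-38.tsv"):
    """Yield each clue in a zipped TSV file as a dict keyed by column name.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member_name) as raw:
            # Assumption: the file is UTF-8 encoded and starts with a header row.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # Assumption: quotation marks inside clue text are literal characters,
            # so quote handling is switched off.
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            yield from reader
```

Because the function is a generator, clues are read one at a time instead of being loaded into memory all at once. To use it, pass the path of the downloaded archive, for example `read_clues_from_zip("path/to/archive.zip")`, where the path is a placeholder for wherever the zipped file was saved.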

There are also individual files for each season (located in the seasons folder). These files are small enough that you should be able to open them with Microsoft Excel or Google Sheets.

  • Seasons 1-12 average 5,060 clues each.
  • Seasons 13-38 average 13,142 clues each.
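As a quick sanity check, the two per-season averages above can be multiplied out and compared with the stated total of 402,416 clues. This is only arithmetic on the numbers given in this README. The small gap is what you would expect if the averages were rounded to whole clues, which is an assumption.

```python
# Seasons 1-12: 12 seasons at an average of 5,060 clues each
early_total = 12 * 5_060
# Seasons 13-38: 26 seasons at an average of 13,142 clues each
later_total = 26 * 13_142

estimate = early_total + later_total
stated_total = 402_416

print(estimate)                 # 402412
print(stated_total - estimate)  # 4
```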

The kids_teen.tsv file contains only clues that appeared in Kids and Teen Tournament matches.

There is a separate goat_tournament_jan2020.tsv file which covers the Jennings-Holzhauer-Rutter event.

I've done my best to clean the data and filter out clues that depend on images, video, or audio.


Column Information:

Label         Description
round         1 for Single Jeopardy, 2 for Double Jeopardy, or 3 for Final Jeopardy.
value         The clue's value on the board. If the clue was a Daily Double, this column will be the wagered amount.
daily_double  yes or no.
category
comments      The host's comments about a category.
answer
question
air_date      The calendar date on which the episode first aired.
notes         Indicates whether a clue appeared in a special match.
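
Every field arrives as text when the TSV files are read, so it can help to convert the columns above into typed values. The sketch below is one possible approach, not part of the dataset. The `Clue` class and `parse_clue` function are hypothetical names, and the code assumes that the header names match the labels above, that `value` is a plain integer string, and that `air_date` is in YYYY-MM-DD format. Check a few rows of the real file before relying on it.

```python
from dataclasses import dataclass
from datetime import date

# Round codes as described in the column table.
ROUND_NAMES = {1: "Single Jeopardy", 2: "Double Jeopardy", 3: "Final Jeopardy"}


@dataclass
class Clue:
    round_name: str
    value: int
    daily_double: bool
    category: str
    comments: str
    answer: str
    question: str
    air_date: date
    notes: str


def parse_clue(row):
    """Convert one row (a dict of strings keyed by column label) into a Clue."""
    return Clue(
        round_name=ROUND_NAMES[int(row["round"])],
        # Assumption: value holds digits only (no "$" sign or thousands separator).
        value=int(row["value"]),
        daily_double=row["daily_double"].strip().lower() == "yes",
        category=row["category"],
        comments=row["comments"],
        answer=row["answer"],
        question=row["question"],
        # Assumption: dates are written as YYYY-MM-DD.
        air_date=date.fromisoformat(row["air_date"]),
        notes=row["notes"],
    )
```

With rows in this form, a filter such as `[c for c in clues if c.daily_double]` selects the Daily Double clues, whose `value` is the wagered amount rather than the board value.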

All data is property of Jeopardy Productions, Inc. and protected under law. I am not affiliated with the show. Please don't use the data to make a public-facing web site, app, or any other product.