/tidytuesday

Official repo for the #tidytuesday project

Primary language: HTML. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

Logo for the TidyTuesday project, represented by the word TidyTuesday over a messy splash of black paint

About TidyTuesday

  • TidyTuesday is a weekly social data project. All are welcome to participate! Please remember to share the code used to generate your results!
  • TidyTuesday is organized by the R4DS Online Learning Community. Join our Slack for free online help with R and other data-related topics, or to participate in a data-related book club!

Goals

Our overarching goal for TidyTuesday is to make learning to work with data easier by providing real-world datasets.

Our goal for 2023-2024 is to increase usage of #TidyTuesday within classrooms. We would like TidyTuesday to be used in at least 10 courses by September 2024. If you are using TidyTuesday to teach data-related skills, please let us know!


How to Participate

  • Data is posted to social media every Monday morning. Follow the instructions in the new post for how to download the data.
  • Explore the data, watching out for interesting relationships. We would like to emphasize that you should not draw conclusions about causation in the data. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our suggestion is to use the data provided to practice your data tidying and plotting techniques, and to consider for yourself what nuances might underlie these relationships.
  • Create a visualization, a model, a shiny app, or some other piece of data-science-related output, using R or another programming language.
  • Share your output and the code used to generate it on social media with the #TidyTuesday hashtag.
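
As a rough illustration of the download step, here is a minimal Python sketch (any language is welcome) that builds the GitHub folder URL for a given week's data. The folder layout `data/<year>/<date>/` and the use of `HEAD` to point at the default branch are assumptions not stated in this README, and `week_folder_url` is a hypothetical helper. The weekly post remains the authoritative source for download instructions.

```python
from datetime import date

# Repository URL as given in the "Citing TidyTuesday" section.
REPO_URL = "https://github.com/rfordatascience/tidytuesday"


def week_folder_url(tuesday: date) -> str:
    """Return the assumed GitHub folder URL for the dataset dated `tuesday`.

    Assumption: each week's files live under data/<year>/<YYYY-MM-DD>/ and
    GitHub resolves the HEAD ref to the repository's default branch.
    """
    if tuesday.weekday() != 1:  # Monday is 0, Tuesday is 1
        raise ValueError(f"{tuesday.isoformat()} is not a Tuesday")
    return f"{REPO_URL}/tree/HEAD/data/{tuesday.year}/{tuesday.isoformat()}"


print(week_folder_url(date(2023, 10, 3)))
```

The Tuesday check is only a guard against typos in the date. If the repository layout differs from the assumption above, adjust the path template rather than the check.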

Datasets

Week | Date | Data | Source | Article
1 | 2023-01-03 | Bring your own data to start 2023! | |
2 | 2023-01-10 | Bird FeederWatch data | FeederWatch | Over 30 Years of Standardized Bird Counts at Supplementary Feeding Stations in North America: A Citizen Science Data Report for Project FeederWatch
3 | 2023-01-17 | Art history data | arthistory data package | Quantifying Art Historical Narratives
4 | 2023-01-24 | Alone data | Alone data package | Alone R package: Datasets from the survival TV series
5 | 2023-01-31 | Pet Cats UK | Movebank for Animal Tracking Data | Cats on the Move
6 | 2023-02-07 | Big Tech Stock Prices | Big Tech Stock Prices on Kaggle | 5 Charts on Big Tech Stocks' Collapse
7 | 2023-02-14 | Hollywood Age Gaps | Hollywood Age Gap | Hollywood Age Gap
8 | 2023-02-21 | Bob Ross Paintings | Bob Ross Paintings data | Bob Ross Colors data package
9 | 2023-02-28 | African Language Sentiment | AfriSenti: Sentiment Analysis dataset for 14 African languages | AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
10 | 2023-03-07 | Numbats in Australia | Atlas of Living Australia | Numbat page at the Atlas of Living Australia
11 | 2023-03-14 | European Drug Development | European Medicines Agency | Dissecting 28 years of European pharmaceutical development
12 | 2023-03-21 | Programming Languages | Programming Language DataBase | Does every programming language have line comments?
13 | 2023-03-28 | Time Zones | IANA tz database | "What Is Daylight Saving Time"
14 | 2023-04-04 | Premier League Match Data | Premier League Match Data 2021-2022 | Who wins the EPL if games end at half time?
15 | 2023-04-11 | US Egg Production Data | US Egg Production Data 2007-2021 | The Humane League Labs US Egg Production Dataset
16 | 2023-04-18 | Neolithic Founder Crops | The "Neolithic Founder Crops" in Southwest Asia: Research Compendium | Revisiting the concept of the 'Neolithic Founder Crops' in southwest Asia
17 | 2023-04-25 | London Marathon | London Marathon R package | Scraping London Marathon data with {rvest}
18 | 2023-05-02 | The Portal Project | Portal Project Data | Portal Project
19 | 2023-05-09 | Childcare Costs | National Database of Childcare Prices | National Database of Childcare Prices
20 | 2023-05-16 | Tornados | NOAA's National Weather Service Storm Prediction Center Severe Weather Maps, Graphics, and Data Page | Diving into US Tornado Data
21 | 2023-05-23 | Central Park Squirrels | 2018 Central Park Squirrel Census | The Squirrel Census
22 | 2023-05-30 | Verified Oldest People | frankiethull: Centenarians | Wikipedia: List of the verified oldest people
23 | 2023-06-06 | Energy | Energy Data Explorer | Our World in Data Energy Complete Dataset
24 | 2023-06-13 | Studying African Farmer-Led Irrigation Survey | SAFI Teaching Dataset for Data Carpentry Social Sciences | SAFI Teaching Dataset
25 | 2023-06-20 | UFO Sightings Redux | National UFO Reporting Center, sunrise-sunset.org | TidyTuesday 2019-06-25
26 | 2023-06-27 | US Populated Places | National Map Staged Products Directory | US Board of Geographic Names
27 | 2023-07-04 | Historical Markers | Historical Marker Database USA Index | Database Counts and Statistics
28 | 2023-07-11 | Global Surface Temperatures | NASA GISS Surface Temperature Analysis (GISTEMP v4) | Improvements in the GISTEMP Uncertainty Model
29 | 2023-07-18 | GPT detectors | GPT detectors R package | GPT Detectors Are Biased Against Non-Native English Writers
30 | 2023-07-25 | Scurvy | medicaldata R package | Scurvy Dataset Description
31 | 2023-08-01 | US States | List of states and territories of the United States, List of demonyms for US states and territories, and List of state and territory name etymologies of the United States | List of states and territories of the United States
32 | 2023-08-08 | Hot Ones Episodes | Hot Ones and List of Hot Ones episodes | Hot Ones
33 | 2023-08-15 | Spam E-mail | Spam e-mail | Spam email database
34 | 2023-08-22 | Refugees | Refugees R package | United Nations High Commissioner for Refugees (UNHCR) Refugee Data Finder
35 | 2023-08-29 | Fair Use | U.S. Copyright Office Fair Use Index | U.S. Copyright Office Fair Use Index
36 | 2023-09-05 | Union Membership in the United States | Union Membership, Coverage, and Earnings from the CPS | Five decades of CPS wages, methods, and union-nonunion wage gaps at Unionstats.com
37 | 2023-09-12 | The Global Human Day | The Global Human Day dataset | The global human day PNAS article
38 | 2023-09-19 | CRAN Package Authors | The CRAN collaboration graph | The CRAN collaboration graph README
39 | 2023-09-26 | Roy Kent F**k count | Deepsha Menghani posit::conf(2023) talk on data visualization and Quarto | richmondway dataset
40 | 2023-10-03 | US Government Grant Opportunities | Grants 101 from Grants.gov | Grants.gov Search Export
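
The Week and Date columns above follow a simple pattern: week 1 falls on 2023-01-03 and each later week is seven days on. A small Python sketch of that arithmetic follows. The helper names `week_to_date` and `date_to_week` are hypothetical, and the code assumes no week is skipped, which holds for the 40 rows listed here.

```python
from datetime import date, timedelta

# Week 1 of 2023 in the table above.
FIRST_TUESDAY_2023 = date(2023, 1, 3)


def week_to_date(week: int) -> date:
    """Return the dataset date for a 2023 week number (1-based)."""
    if not 1 <= week <= 52:
        raise ValueError(f"week must be between 1 and 52, got {week}")
    return FIRST_TUESDAY_2023 + timedelta(weeks=week - 1)


def date_to_week(day: date) -> int:
    """Return the 2023 week number for a dataset date listed in the table."""
    offset = (day - FIRST_TUESDAY_2023).days
    if offset < 0 or offset % 7 != 0 or offset // 7 >= 52:
        raise ValueError(f"{day.isoformat()} is not a 2023 TidyTuesday date")
    return offset // 7 + 1


print(week_to_date(40))                 # 2023-10-03
print(date_to_week(date(2023, 6, 20)))  # 25
```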

Citing TidyTuesday

To cite the TidyTuesday repo/project in publications, use:

R4DS Online Learning Community (2023). Tidy Tuesday: A weekly social data project. https://github.com/rfordatascience/tidytuesday.

A BibTeX entry for LaTeX users is:

  @misc{tidytuesday, 
    title = {Tidy Tuesday: A weekly social data project}, 
    author = {R4DS Online Learning Community}, 
    url = {https://github.com/rfordatascience/tidytuesday}, 
    year = {2023} 
  }

Note: If you would like to cite the tidytuesdayR package, you should use citation("tidytuesdayR") instead.


Submitting Datasets

TidyTuesday is built around open datasets that are found in the "wild" or submitted as Issues on our GitHub.

If you find a dataset that you think would be interesting, you can approach it in one of two ways:

Submit the dataset as an Issue

  1. Find an interesting dataset
  2. Find a report, blog post, article, etc. relevant to the data
  3. Submit the dataset as an Issue along with a link to the article (and, ideally, two images from the article, with alt text)

Create an entire TidyTuesday challenge!

  1. Find an interesting dataset
  2. Find a report, blog post, article, etc. relevant to the data (or create one yourself!)
  3. Let us know you've found something interesting and are working on it by filing an Issue on our GitHub
  4. Provide a link to the data (or the raw data itself) and a cleaning script for the data
  5. Write a basic readme.md file using a recent readme.md as a template. Make sure to give yourself credit!