Data Analysis Workflows & Reproducibility Learning Resources

MIT License

This repository aims to collect resources relating to workflow and tooling choices that promote reproducibility and best practice in data analysis and data science projects.

The resources have been organised as:

  • R Packages
  • Books
  • Papers
  • Blog Posts
  • Talks and Videos

Contributions are welcome. Please file an issue, submit a PR, or email me at deanmarchiori@gmail.com.


R Packages

| Package | About | Available on |
|---|---|---|
| drake | An R-focused pipeline toolkit for reproducibility and high-performance computing | CRAN |
| ProjectTemplate | ProjectTemplate is a system for automating the thoughtless parts of a data analysis project | CRAN |
| workflowr | A Framework for Reproducible and Collaborative Data Science | CRAN |
| rrtools | Tools for Writing Reproducible Research in R | GitHub |
| orderly | Lightweight Reproducible Reporting for R | CRAN |
| fnmate | A function definition generator | GitHub |
| dflow | Automatically set up a drake project | GitHub |
| represtools | Basic utility functions to support reproducible research | CRAN |
| starters | R Package for initializing projects for various R activities | GitHub |
| targets | Function-oriented Make-like declarative workflows for R | GitHub |

Books

| Title | Authors | Year |
|---|---|---|
| Agile Data Science with R - A workflow | Edwin Thoen | 2020 |
| What They Forgot to Teach You About R | Jennifer Bryan, Jim Hester | 2020 |
| The Turing Way: A Handbook for Reproducible Data Science | Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, Kirstie Whitaker | 2019 |

Papers

| Title | Citation |
|---|---|
| Packaging Data Analytical Work Reproducibly Using R (and Friends) | Ben Marwick, Carl Boettiger & Lincoln Mullen (2018) Packaging Data Analytical Work Reproducibly Using R (and Friends), The American Statistician, 72:1, 80-88, DOI: 10.1080/00031305.2017.1375986 |
| Opinionated analysis development | Parker H. 2017. Opinionated analysis development. PeerJ Preprints 5:e3210v1 https://doi.org/10.7287/peerj.preprints.3210v1 |

Blog Posts


Talks and Videos