Materials for the DSS Computing Bootcamp, 2018

Primary language: HTML. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

DSS Computing Bootcamp

This is a 3-hour computing bootcamp for incoming PhD and MS students in the Department of Statistical Science at Duke University.

The workshop will cover the following topics:

Introduction to Reproducible Research

  • Recognize the problems that reproducible research helps address, featuring a brief discussion of case studies in which a lack of reproducibility caused problems.
  • Identify pain points in getting your analysis to be reproducible.
  • Understand the role of documentation, sharing, automation, and organization in making your research more reproducible, and get introduced to some tools that address these problems, specifically R/RStudio/RMarkdown.

Organizing your project to facilitate Reproducible Research

  • Organize projects and folders to enable reproducibility and reusability.
  • Understand the structure of data files and the importance of documenting all changes made.
  • Using these practices, create a reproducible project workflow using R/RStudio/RMarkdown.
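
As a rough illustration of the organization ideas above, a project skeleton might be set up as follows. This is a minimal sketch, not the layout prescribed by the workshop: the project name `my-analysis` and the folder names (`data/raw`, `data/processed`, `scripts`, `output`) are hypothetical and follow one common convention among many.

```shell
set -eu

# Build the skeleton in a throwaway location so nothing existing is touched.
# "my-analysis" is a hypothetical project name.
project="$(mktemp -d)/my-analysis"

# Keep untouched raw data apart from derived data, code, and results.
mkdir -p "$project/data/raw" \
         "$project/data/processed" \
         "$project/scripts" \
         "$project/output"

# A top-level README documents what each folder is for
# and where changes to the data get recorded.
cat > "$project/README.md" <<'EOF'
my-analysis (hypothetical example project)

data/raw        original data files, never edited by hand
data/processed  data derived from data/raw by code in scripts/
scripts         analysis code (for example R scripts or RMarkdown files)
output          figures, tables, and rendered reports
EOF

# Show the resulting layout.
find "$project" | sort
```

Keeping raw data separate from processed data means every change to the data has to pass through a script, which is what makes those changes documented and repeatable.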

Version control

  • Introduction to Git/GitHub as a version control tool.
  • Practice initiating a project directory, making, committing, and pushing changes, and creating a pull request to someone else's remote repository.
  • Discuss the role of version control in reproducibility of one's own project as well as in collaborative projects.
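
The practice steps above can be sketched on the command line roughly as follows. This is an illustrative sketch under stated assumptions, not the workshop's actual exercise: the directory name `bootcamp-demo`, the commit message, and the placeholder identity are all hypothetical, and the push step is left commented out because it needs a real remote repository.

```shell
set -eu

# Work in a throwaway directory so nothing existing is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# 1. Initiate a project directory ("bootcamp-demo" is a hypothetical name).
mkdir bootcamp-demo
cd bootcamp-demo
git init

# 2. Make a change.
echo "# Bootcamp demo" > README.md

# 3. Stage and commit the change. The -c options supply a placeholder
#    identity for this one command; normally you would set your own
#    name and email once with `git config --global`.
git add README.md
git -c user.name="Example Student" -c user.email="student@example.com" \
    commit -m "Add README"

# 4. Push the commit to a remote. Commented out because it requires a real
#    repository URL, for example your fork of someone else's repository.
# git remote add origin <URL-of-your-fork>
# git push -u origin HEAD
```

After pushing to a fork, the pull request itself is typically opened from the repository's page on GitHub rather than from the Git command line.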

Introduction to the department computing ecosystem

  • Account activation and access to departmental servers.
  • Discussion of how to responsibly use distributed computing resources.

Acknowledgments