Duke University - Department of Statistical Science Computing Bootcamp 2021

This repository contains the computing bootcamp materials for incoming Ph.D. and M.S. students in the Department of Statistical Science at Duke University. These materials are adapted from those developed by Shawn Santo, Mine Çetinkaya-Rundel, and Colin Rundel.

Getting started

Computing resources

Duke computing resources and getting help
- Duke VPN
- Duke software
- Compute cluster

DSS computing resources and getting help
- RStudio Pro servers

Version control and R

Version control

Introduce Git and GitHub
Initialize a project directory and understand the Git workflow
Discuss the role of version control in reproducibility
Discuss version control best practices

Introduction to reproducible research

Recognize the problems that reproducible research helps address, with a brief discussion of case studies that went wrong and how reproducible practices might have helped
Identify pain points in getting your analysis to be reproducible
Discuss the role of documentation, sharing, automation, and organization in making your research more reproducible
Introduce some tools to solve these problems, specifically R / RStudio / R Markdown

Organizing your project to facilitate reproducible research

Organize projects and folders to enable reproducibility and reusability
Understand the structure of data files and the importance of documenting all changes made
Create a reproducible project workflow using R / RStudio / R Markdown

R / RStudio and R Markdown

Navigate R Markdown and RStudio
Analyze data and create graphics with the tidyverse packages
Discuss workflow

Python

Navigate Jupyter notebooks
Introduce Python data structures, control flow, functions, and the basics of object-oriented programming
Discuss popular Python packages including NumPy, SciPy, pandas, matplotlib, seaborn, and scikit-learn
Highlight similarities and differences between Python and R
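
As a rough preview of the Python topics listed above, the sketch below touches data structures (lists, tuples, dictionaries), control flow, functions, and a small class. It is not taken from the bootcamp slides. The names `RunningMean`, `group_means`, and `records` are hypothetical and chosen only for illustration.

```python
class RunningMean:
    """Accumulate values one at a time and report their mean (illustrative class)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value):
        # Add one observation to the running totals.
        self.count += 1
        self.total += value

    def mean(self):
        if self.count == 0:
            raise ValueError("no values have been added")
        return self.total / self.count


def group_means(records):
    """Map each group label to the mean of its values.

    `records` is a list of (group, value) tuples.
    """
    trackers = {}
    for group, value in records:
        # Create a tracker the first time a group label appears.
        if group not in trackers:
            trackers[group] = RunningMean()
        trackers[group].update(value)
    # Dictionary comprehension: one mean per group.
    return {group: tracker.mean() for group, tracker in trackers.items()}


# Made-up example data: a list of tuples.
records = [("a", 1.0), ("b", 4.0), ("a", 3.0)]
means = group_means(records)

# Python indexing starts at 0, whereas R indexing starts at 1.
first = records[0]
```

One difference from R worth noticing here is that `records[0]` returns the first element, while the R equivalent would use index 1.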

References

See slides for references related to specific topics.
