course notes for "Data Wrangling and Presentation in R"

Data Wrangling and Visualization in R

BISC 888-1 Directed Readings at Simon Fraser University

Course Description

When presenting research, "a picture is worth a thousand words", but how do you make that picture as clear and compelling as possible? The R programming language is predominantly known for statistical analysis, but it can also produce publication-quality figures for scientific papers, international newspapers (e.g., The New York Times), and websites. Beyond its core functionality, R's visualization capabilities are extended by add-on packages, most notably ggplot2. In this course we move beyond basic plotting and highlight some of the more powerful approaches to visualization in R. Students will learn to rapidly explore their data with the ggplot2 package and to develop highly customized figures with base graphics functions. The course also covers the plyr and reshape2 packages, which are useful tools for formatting, reshaping, and "wrangling" data sets before plotting or analysis. It is suitable both for students who have never used R and for those who have used R but have not yet explored its more powerful data-manipulation and graphing capabilities. Throughout the course, the exercises and assignments emphasize reproducible research documentation ("literate programming"), in which documentation, code, and figures are combined using the easy-to-learn plain-text language Markdown.

We will meet for 2-hour blocks every 2 weeks. Students will receive 1 course credit, graded pass/fail. The final grade is determined by completion of assignments and attendance. Assignments will be given at the end of each meeting and will be due prior to the following meeting. Attendance is mandatory for a passing grade.

Course Instructors

Class size

  • Based on enrollment (limited to 20 students)

Course Highlights

  • We will meet for 2-hour sessions every two weeks, for a total of 6 sessions
  • Targeted for new and experienced R users
  • Learn basic R commands and usage
  • Produce high-quality graphics using R base graphics and ggplot2
  • Construct custom figures using par
  • Learn to manipulate data quickly using plyr and reshape2
  • Begin using Markdown and R together to generate reproducible reports
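
As a taste of the data-manipulation highlight above, here is a minimal sketch of a reshape-then-summarize workflow. It is an illustration rather than course material: it assumes the plyr and reshape2 packages are already installed, uses R's built-in iris data set, and the object and column names (iris_long, iris_means, iris_wide, measurement, cm, mean_cm) are made up for this example.

```r
# Assumes install.packages(c("plyr", "reshape2")) has already been run.
library(plyr)
library(reshape2)

# Built-in iris data: one row per flower, four measurement columns.
# melt() stacks the four measurement columns into long format.
iris_long <- melt(iris, id.vars = "Species",
                  variable.name = "measurement", value.name = "cm")

# ddply() splits the data by Species and measurement, summarises each
# piece, and recombines the results into a single data frame.
iris_means <- ddply(iris_long, c("Species", "measurement"), summarise,
                    mean_cm = mean(cm), n = length(cm))

# dcast() reshapes the summary back to wide format, one row per species.
iris_wide <- dcast(iris_means, Species ~ measurement, value.var = "mean_cm")

print(iris_wide)
```

The long format produced by melt() is usually the more convenient shape for grouped summaries and for plotting with ggplot2, while the wide format from dcast() is easier to read as a table.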

Course Overview

Before we start

  • Install R version 3.0.0 or higher from CRAN
  • Install RStudio from rstudio.org

Topics covered

  • Introduction to R
  • Reproducible documents with Markdown
  • R base graphics
  • Data manipulation in R: plyr and reshape2
  • Grid-based graphics in R (ggplot2)
  • Multipanel plotting with base graphics (layout, mfrow, split.screen)
  • Additional graphics customization (par, building plots with lines, points, etc.)
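
The multipanel and customization topics above can be sketched with base graphics alone. The example below is only an illustration under stated assumptions: it uses the built-in mtcars data set, writes to a temporary PDF so that it runs without a display, and the names out_file, panel_map, and n_panels are invented for this sketch.

```r
# Draw to a temporary PDF so the sketch also runs without a display.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 7, height = 7)

# par(mfrow = ...) fills a regular 1 x 2 grid of panels, row by row.
# mar sets the margins (bottom, left, top, right) in lines of text.
par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")
boxplot(mpg ~ cyl, data = mtcars, main = "By cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")

# layout() allows unequal panels: figure 1 spans the top row,
# figures 2 and 3 share the bottom row. It returns the figure count.
panel_map <- matrix(c(1, 1,
                      2, 3), nrow = 2, byrow = TRUE)
n_panels <- layout(panel_map)

# Build a plot piece by piece: empty axes first, then points and a line.
plot(mpg ~ wt, data = mtcars, type = "n",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
points(mtcars$wt, mtcars$mpg, pch = 19, col = "grey40")
lines(lowess(mtcars$wt, mtcars$mpg), lwd = 2)

hist(mtcars$wt, main = "", xlab = "Weight (1000 lbs)")
hist(mtcars$hp, main = "", xlab = "Horsepower")

dev.off()
```

Calling layout() after par(mfrow = ...) replaces the earlier grid and starts a new page, so the PDF ends up with two pages: the regular grid first, then the unequal layout.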

Assignments

  • Assignments are due at the start of class
  • Assignments are designed to apply the skills learned and practice using R
  • All assignments will be written in Markdown format