Welcome to CSB1020H/F & CSB1021H/S - Introduction to R for Data Science!
CAGEF Training & Outreach Material by Erica Acton (erica.acton@utoronto.ca)
This repository is part of the Centre for the Analysis of Genome Evolution & Function's (CAGEF) bioinformatics training initiative. These courses and workshops were developed based on feedback about the needs and interests of the Department of Cell & Systems Biology and the Department of Ecology and Evolutionary Biology at the University of Toronto.
Course Information
Coordinators
Professor D. Guttman and Erica Acton
Offered
Winter 2019 - January 10 - February 20 (6 weeks)
Fall 2018 - September 18 - October 23 (6 weeks)
Weight
One module (0.25 FCE)
Time
Thursdays, 3:00 - 6:00pm
Location
St. George Campus, Earth Sciences Centre, Rm 3087
Description
This course is a beginner’s introduction to R and R-Studio for students who do not have a computer science background. It is intended for students who want to develop the skills to analyze their own data. Students who complete this course will be able to 1) work comfortably with the R-Studio environment, data structures and data types, 2) import data into R and manipulate data frames, 3) transform a ‘messy’ dataset into a ‘tidy’ dataset, 4) make exploratory plots, 5) use string manipulation to clean data, and 6) perform basic statistical tests and run a regression model. The class is structured as a ‘code-along’, and students are expected to bring a laptop.
Evaluation
Grades in this module will be determined by a combination of participation in in-class quizzes (6 x 5% = 30%), short assignments (5 x 10% = 50%), and a final project (20%). Short assignments require students to apply the material learned in each lesson, with an emphasis on concise, well-documented code. The final project brings together concepts from all lessons by performing exploratory data analysis on a dataset of interest.
Pre-requisites
A laptop computer with R and R-Studio installed (https://cloud.r-project.org/ and https://www.rstudio.com/products/rstudio/download/) is REQUIRED and must be brought to class. There will be multiple-choice questions on Socrative, which requires an internet connection; a class key will be provided in class. Participation counts toward your final grade.
As preparatory material for the course, students should install swirl (install.packages("swirl")). Once swirl is installed, type library(swirl) and follow the prompts (i.e., type what it tells you to type). From list 1 - R Programming, complete 1: Basic Building Blocks, 3: Sequences of Numbers, 4: Vectors, 7: Matrices and Data Frames.
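The swirl setup described above can be sketched in the R console as follows. This is a minimal sketch, assuming an internet connection and a working CRAN mirror; the requireNamespace() check and the interactive() guard are conveniences added here, not part of the course instructions.

```r
# Install swirl only if it is not already available (assumes a CRAN mirror is reachable).
if (!requireNamespace("swirl", quietly = TRUE)) {
  install.packages("swirl")
}

# Load the package; it prints a message telling you how to begin.
library(swirl)

# Start the interactive tutorial and follow the prompts.
# Guarded so that nothing happens if this is run as a non-interactive script.
if (interactive()) {
  swirl()
}
```

Note that the package name is wrapped in straight quotes; curly quotes pasted from a document will cause a syntax error in R.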
Reading materials
A reference throughout the course will be R for Data Science (http://r4ds.had.co.nz/).
Website
All lesson materials and datasets for the course are found at https://github.com/eacton/CAGEF. Each lesson README page (linked to in 'Content' below) has a link to download the lesson folder. Assignments will be submitted to a course Dropbox.
Office Hours
By appointment: e-mail erica.acton@utoronto.ca to arrange a time.
Location: 25 Willcocks St, Room 4035
Content
Lesson 1 - Intro to R and R-Studio: Becoming Friends with the R Environment
Lesson 2 - Basic Life Skills: How to Read, Write, and Manipulate (Your Data)
Lesson 3 - Intro to Tidy Data: Go Long!
Lesson 5 - Plot all the things! From Data Exploration to Publication-Quality Figures