Learning R / Bioconductor for Sequence Analysis

Fred Hutchinson Cancer Research Center, Seattle, WA
October 27-29

Location: October 27 / 28: Arnold Building M1-A303; October 29: Thomas D1-080

Contact: Martin Morgan (content, mtmorgan@fhcrc.org); Melissa Alvendia (administration, malvendia@fhcrc.org)

This course is directed at beginning and intermediate users who would like an introduction to the analysis and comprehension of high-throughput sequence data using R and Bioconductor. Day 1 focuses on learning essential background: an introduction to the R programming language; central concepts for effective use of Bioconductor software; and an overview of high-throughput sequence analysis work flows. Day 2 emphasizes use of Bioconductor for specific tasks: an RNA-seq differential expression work flow; exploratory, machine learning, and other statistical tasks; gene set enrichment; and annotation. Day 3 transitions to understanding effective approaches for managing larger challenges: strategies for working with large data, writing re-usable functions, developing reproducible reports and work flows, and visualizing results. The course combines lectures with extensive hands-on practicals; students are required to bring a laptop with wireless internet access and a modern version of the Chrome or Safari web browser.

Schedule (tentative)

Day 1: Learn R / Bioconductor

9:00 - 10:30 Introduction to R: objects, functions, help!
11:00 - 12:30 Introduction to Bioconductor: working with packages and classes
1:30 - 5:00 (break: 3:00 - 3:30) Introduction to sequence analysis: typical work flow; data types and quality assessment; essential Bioconductor packages

Day 2: Use R / Bioconductor

9:00 - 12:30 (break: 10:30 - 11:00) An RNA-seq differential expression work flow (detail)
1:30 - 2:00 Other work flows (survey): ChIP-seq, variants, copy number, epigenomics
2:00 - 3:00 Machine learning; exploratory and other statistical analysis
3:30 - 4:00 Annotating genes, genomes, and variants
4:00 - 5:00 Approaches to gene set enrichment

Day 3: Develop Skills and Best Practices

9:00 - 10:30 Working with large data
11:00 - 12:30 Organizing code in functions, files, and packages
1:30 - 3:00 Reproducible reports and work flows: markdown
3:30 - 4:30 Visualization
4:30 - 5:00 Summary
