Fred Hutchinson Cancer Research Center, Seattle, WA
October 27-29
Location: October 27 / 28: Arnold Building M1-A303; October 29: Thomas Building D1-080
Contact: Martin Morgan (content, mtmorgan@fhcrc.org); Melissa Alvendia (administration, malvendia@fhcrc.org)
This course is directed at beginning and intermediate users who would like an introduction to the analysis and comprehension of high-throughput sequence data using R and Bioconductor. Day 1 focuses on learning essential background: an introduction to the R programming language; central concepts for effective use of Bioconductor software; and an overview of high-throughput sequence analysis work flows. Day 2 emphasizes use of Bioconductor for specific tasks: an RNA-seq differential expression work flow; exploratory, machine learning, and other statistical tasks; gene set enrichment; and annotation. Day 3 transitions to understanding effective approaches for managing larger challenges: strategies for working with large data, writing re-usable functions, developing reproducible reports and work flows, and visualizing results. The course combines lectures with extensive hands-on practicals; students are required to bring a laptop with wireless internet access and a modern version of the Chrome or Safari web browser.
Day 1: Learn R / Bioconductor
- 9:00 - 10:30 Introduction to R: objects, functions, help!
- 11:00 - 12:30 Introduction to Bioconductor: working with packages and classes
- 1:30 - 5:00 (break: 3:00 - 3:30) Introduction to sequence analysis: typical work flow; data types and quality assessment; essential Bioconductor packages
Day 2: Use R / Bioconductor
- 9:00 - 12:30 (break: 10:30 - 11:00) An RNA-seq differential expression work flow (detail)
- 1:30 - 2:00 Other work flows (survey): ChIP-seq, variants, copy number, epigenomics
- 2:00 - 3:00 Machine learning; exploratory and other statistical analysis
- 3:30 - 4:00 Annotating genes, genomes, and variants
- 4:00 - 5:00 Approaches to gene set enrichment
Day 3: Develop Skills and Best Practices
- 9:00 - 10:30 Working with large data
- 11:00 - 12:30 Organizing code in functions, files, and packages
- 1:30 - 3:00 Reproducible reports and work flows: markdown
- 3:30 - 4:30 Visualization
- 4:30 - 5:00 Summary