Informatics for RNA-seq: A web resource for analysis on the cloud


An educational tutorial and working demonstration pipeline for RNA-seq analysis, including an introduction to cloud computing, next-generation sequence file formats, reference genomes, gene annotation, expression analysis, differential expression analysis, alternative splicing analysis, data visualization, and interpretation.

This repository stores the code and some of the raw materials for a detailed RNA-seq tutorial. To work through the tutorial itself, go to the RNA-seq tutorial wiki.

Citation: Malachi Griffith*, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith*. 2015. Informatics for RNA-seq: A web resource for analysis on the cloud. PLoS Comp Biol. 11(8):e1004393.

*To whom correspondence should be addressed. E-mail: mgriffit[AT]genome.wustl.edu, ogriffit[AT]genome.wustl.edu

Note: An archived version of this tutorial exists here. This version is maintained for consistency with the published materials (Griffith et al. 2015. PLoS Comp Biol.) and for past students wishing to review covered material. However, we strongly suggest that you continue with the current version of the tutorial below.

Want to contribute to the RNA-seq Wiki?

Fork it and send a pull request.


Tutorial Table of Contents

  1. Module 0 - Introduction and Cloud Computing
    1. Authors
    2. Citation and Supplementary Materials
    3. Syntax
    4. Intro to AWS Cloud Computing
    5. Logging into Amazon Cloud
    6. Unix Bootcamp
    7. Environment
    8. Resources
  2. Module 1 - Introduction to RNA sequencing
    1. Installation
    2. Reference Genomes
    3. Annotations
    4. Indexing
    5. RNA-seq Data
    6. Pre-Alignment QC
  3. Module 2 - RNA-seq Alignment and Visualization
    1. Adapter Trim
    2. Alignment
    3. IGV
    4. Alignment Visualization
    5. Alignment QC
  4. Module 3 - Expression and Differential Expression
    1. Expression
    2. Differential Expression
    3. DE Visualization
    4. Kallisto for Reference-Free Abundance Estimation
  5. Module 4 - Isoform Discovery and Alternative Expression
    1. Reference Guided Transcript Assembly
    2. De novo Transcript Assembly
    3. Transcript Assembly Merge
    4. Differential Splicing
    5. Splicing Visualization
  6. Module 5 - De novo transcript reconstruction
    1. De novo RNA-Seq Assembly and Analysis Using Trinity
  7. Module 6 - Functional Annotation of Transcripts
    1. Functional Annotation of Assembled Transcripts Using Trinotate
  8. Appendix
    1. Saving Your Results
    2. Abbreviations
    3. Lectures
    4. Practical Exercise Solutions
    5. Integrated Assignment
    6. Proposed Improvements
    7. AWS Setup