/Deconvolution2016

Code and manuscript files for Aaron and Karsten's deconvolution paper.

Using deconvolution to normalize scRNA-seq data with many zeroes

To run the simulation code, enter the simulations directory and then:

  1. Run lowcounts.R to perform the low-count simulations, or brittlesim.R to perform the high-count simulations.
  2. Run standerr.R to estimate the variance of the size factor estimates across methods.
  3. Run poolsim.R to compare the variability of the estimates with and without the ring arrangement.
  4. Run complexity.R to determine the time complexity of the deconvolution method.
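
The steps above can be sketched as a small shell function. This is only an illustration: it assumes `Rscript` is on the `PATH`, that each script runs without arguments, and that the function is called from the repository root. The function name `run_simulations` and the `RSCRIPT` override variable are hypothetical and not part of the repository.

```shell
# Sketch only: run the simulation scripts in the order listed above.
# Assumes Rscript is installed and the scripts need no arguments.
# RSCRIPT is a hypothetical override for the interpreter (e.g. RSCRIPT=echo for a dry run).
run_simulations() {
    # First argument selects step 1: lowcounts.R (default) or brittlesim.R.
    first="${1:-lowcounts.R}"
    (
        # Subshell so the caller's working directory is left unchanged.
        cd simulations || exit 1
        for script in "$first" standerr.R poolsim.R complexity.R; do
            # Stop at the first script that fails.
            "${RSCRIPT:-Rscript}" "$script" || exit 1
        done
    )
}
```

Calling `run_simulations` runs the low-count variant of step 1, while `run_simulations brittlesim.R` swaps in the high-count one.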

You can also run fewcounts.R to see behaviour with few cells, or highcounts.R to see behaviour at very high counts. The moresims directory contains additional simulations under various scenarios that were not included in the publication.

To run the real data analysis code:

  1. Make a data subdirectory and download the Zeisel et al. tables (http://linnarssonlab.org/cortex) and the Klein data (supplementary tables in GSM1599494, GSM1599499).
  2. Enter the realdata directory and run Zeisel.R and Klein.R to pre-process the data and estimate size factors for all cells in each of those two data sets.
  3. Run edgeR.R to identify DE genes in each data set, and GOAnalysis.R to perform a GO analysis on the DE genes.
  4. Run HVGAnalysis.R to identify highly variable genes in each data set.
  5. Run switchTestedgeR.R to perform the offset/covariate switching analysis.
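
Steps 2 to 5 can be sketched in the same way. As before, this is an assumption-laden illustration rather than a script shipped with the repository: it assumes `Rscript` is on the `PATH`, that the scripts take no arguments, and that the data from step 1 has already been downloaded by hand. The names `run_realdata` and `RSCRIPT` are hypothetical.

```shell
# Sketch only: run the real data analysis scripts in the order listed above.
# Step 1 (creating the data subdirectory and downloading the tables) is manual
# and is deliberately not automated here.
# RSCRIPT is a hypothetical override for the interpreter (e.g. RSCRIPT=echo for a dry run).
run_realdata() {
    (
        # Subshell so the caller's working directory is left unchanged.
        cd realdata || exit 1
        for script in Zeisel.R Klein.R edgeR.R GOAnalysis.R HVGAnalysis.R switchTestedgeR.R; do
            # Later scripts use the output of earlier ones, so stop on the first failure.
            "${RSCRIPT:-Rscript}" "$script" || exit 1
        done
    )
}
```

Setting `RSCRIPT=echo` before calling `run_realdata` prints the script names in order without running them, which is a cheap way to check the sequence.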

Also, run plotKleinParam.R to generate plots that justify parameter settings in the simulations.

The manuscript directory contains all LaTeX code used to generate the manuscript. This can be compiled with make.