
Pipeline for initial analysis of droplet-based single-cell RNA-seq data

Primary language: C++. License: GNU General Public License v3.0 (GPL-3.0).

dropEst - Pipeline

Pipeline for estimating molecular count matrices for droplet-based single-cell RNA-seq measurements. If you use the pipeline in your research, please cite the corresponding paper. To reproduce results from the paper, please see this repository.

Documentation

For detailed explanations, please see the documentation.

In particular, if you have problems with installation, please look at the Troubleshooting page, and open an issue if your problem is not covered there.

News

[0.8.6] - 2019-08-01

  • Added support for Drop-seq and CEL-Seq2

See Changelog for the full list.

General processing steps

  1. dropTag: extraction of cell barcodes and UMIs from the library. Result: demultiplexed .fastq.gz files, which should be aligned to the reference.
  2. Alignment of the demultiplexed files to the reference genome. Result: .bam files with the alignment.
  3. dropEst: building the count matrix and estimating statistics needed for quality control. Result: .rds file with the count matrix and statistics. Optionally: count matrix in MatrixMarket format.
  4. dropReport: generation of a report on library quality.
  5. dropEstR: R package for UMI count correction and cell quality classification.
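If step 3 is run with the optional MatrixMarket output, the count matrix can be inspected outside of R. The following is a minimal sketch, not part of dropEst itself: it parses a MatrixMarket coordinate file using only the Python standard library and sums the counts per column. The function names `read_matrix_market` and `umis_per_column` are hypothetical, the toy matrix is made up for illustration, and the orientation (genes as rows, cells as columns) is an assumption that should be checked against the actual output.

```python
import io


def read_matrix_market(handle):
    """Parse a MatrixMarket 'coordinate' file into (n_rows, n_cols, entries).

    Hypothetical helper for illustration; not part of dropEst.
    """
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("expected a MatrixMarket coordinate matrix")

    # Skip comment lines (starting with '%') and blank lines before the size line.
    line = handle.readline()
    while line and (line.startswith("%") or not line.strip()):
        line = handle.readline()
    n_rows, n_cols, n_entries = (int(x) for x in line.split())

    entries = []
    for line in handle:
        if not line.strip():
            continue
        row, col, value = line.split()
        # MatrixMarket indices are 1-based; convert to 0-based.
        entries.append((int(row) - 1, int(col) - 1, float(value)))

    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


def umis_per_column(n_cols, entries):
    """Sum counts per column (assumed here to correspond to cells)."""
    totals = [0.0] * n_cols
    for _, col, value in entries:
        totals[col] += value
    return totals


# Toy data (assumption): 3 genes x 2 cells, 4 non-zero entries.
example = """%%MatrixMarket matrix coordinate integer general
% toy data for illustration only
3 2 4
1 1 5
2 1 1
3 2 2
1 2 7
"""

n_rows, n_cols, entries = read_matrix_market(io.StringIO(example))
totals = umis_per_column(n_cols, entries)
print(totals)  # [6.0, 9.0]
```

For real data, replace the `io.StringIO` object with an opened file. For large matrices, an established reader such as `scipy.io.mmread` is likely a better choice than this hand-written parser.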

Examples

Complete examples of the pipeline can be found at EXAMPLES.md.

Here are the results of processing the neurons_900 10x dataset.

Supported protocols

  • 10x
  • CEL-Seq2
  • Drop-seq
  • iCLIP
  • inDrop (v1-3)
  • Seq-Well
  • SPLiT-seq

Citation

If you find this pipeline useful for your research, please consider citing the paper:

Petukhov, V., Guo, J., Baryawno, N., Severe, N., Scadden, D. T., Samsonova, M. G., & Kharchenko, P. V. (2018). dropEst: pipeline for accurate estimation of molecular counts in droplet-based single-cell RNA-seq experiments. Genome Biology, 19(1), 78. doi:10.1186/s13059-018-1449-6