Pipeline for processing raw 10X scRNA-seq data to obtain .loom and .rds files, as well as running the default Seurat analysis pipeline.
Two tools are available for generating count matrices: Cell Ranger by 10x Genomics and velocyto by the Linnarsson Lab. The latter includes both intronic and exonic reads when generating count matrices, which is needed for counting lncRNAs.
TODO
Provide only the experiment name. The script looks for it in /nfsdata/data/data-runs/170907-kirkeby-mistr/
Example
bash run_velocyto.sh d5d-5000_cells
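The lookup by experiment name can be sketched as follows. This is only an illustration, not the contents of run_velocyto.sh: the function name find_experiment_dir is hypothetical, and it assumes the experiment is stored as a directory whose name equals the experiment name somewhere below the data directory.

```shell
# Hypothetical helper, not taken from run_velocyto.sh.
# Assumption: each experiment is a directory named exactly like the experiment.
find_experiment_dir() {
    local base_dir="$1"    # directory to search, e.g. the data-runs path
    local experiment="$2"  # experiment name, e.g. d5d-5000_cells
    local matches

    # Collect all directories below base_dir whose name equals the experiment name.
    matches=$(find "$base_dir" -type d -name "$experiment" 2>/dev/null)

    if [ -z "$matches" ]; then
        echo "No directory named '$experiment' found under $base_dir" >&2
        return 1
    fi

    # If several directories match, report only the first one.
    printf '%s\n' "$matches" | head -n 1
}
```

Called as find_experiment_dir /nfsdata/data/data-runs/170907-kirkeby-mistr/ d5d-5000_cells, it would print the matching directory, or fail with a message if the name is misspelled.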
Example
./loom2rds.R /scratch/tstannius/velocyto/loom-files/d5d-5000_cells.loom
This will save a Seurat object in out/d5d-5000_cells_raw.
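The output location appears to follow from the input file name. A minimal sketch of that mapping, assuming the pattern out/<loom file name without .loom>_raw, which is inferred from the single example above rather than read from loom2rds.R (the function name loom_to_output_path is hypothetical):

```shell
# Hypothetical helper: predict the output path for a given .loom file.
# Assumption: output is out/<basename without .loom>_raw, as in the example above.
loom_to_output_path() {
    local loom_file="$1"
    local name

    # Strip the directory part and the .loom extension.
    name=$(basename "$loom_file" .loom)

    printf 'out/%s_raw\n' "$name"
}

loom_to_output_path /scratch/tstannius/velocyto/loom-files/d5d-5000_cells.loom
# prints: out/d5d-5000_cells_raw
```

This can be handy for checking whether a sample has already been converted before rerunning the conversion.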
This is needed for the demultiplexing step.
E.g. finding markers, dimensionality reduction, or integration of datasets.
This was the first experiment to be sequenced, and it should be treated differently for the following reasons:
- The day 14 MISTR tissue was supposed to be split into 5 parts, A-E, and sequenced separately, since we were not aware of HTOs. Thus it is not necessary to run CITE-seq-Count and do demultiplexing. (QUA: What about other sorts of QC?)
- The day 14 MISTR tissue subregions were accidentally mixed. Originally, there should have been 5 runs (A-E), but A and B were pooled with C. Thus we only have three datasets: c, d and e.
The suggested strategy for the day 14 datasets is instead to integrate them using the Seurat approach (QUA: Use RNA or integrated assay?).