kharchenkolab/dropEst

Filtered cells are empty

yihming opened this issue · 6 comments

Hello,

I tried to use dropEst 0.8.5 to run estimation on my processed BAM files (after the dropTag and alignment steps, both of which executed properly according to the stats). The command I used is

dropest -f -c config.xml -w alignment_result/alignment_*.bam

I used the indrop3 config.xml file provided in your repository.

However, the resulting count matrix contains 0 genes and 0 cells, with a warning saying: "Filtered cells are empty. Probably, filtration threshold is too strict or you forgot to run 'merge_and_filter'". Please refer to est_main.log for the details.

So could you please help me by suggesting which configuration or command-line arguments I should adjust to get my result? Thank you!

Sincerely,
Yiming Yang

Hi,
It means that your bam file is missing information about either barcodes or genes. Most probably, you just need to pass the gtf file to dropest using the -g option. Otherwise, it means that you used the -s option in droptag.

Hi Viktor,

Thank you very much for your quick response!

I followed your suggestion to pass the gtf file using -g option in dropest:

dropest -f -c config.xml -g gene.gtf -w alignment_result/alignment_*.bam

where gene.gtf is the hg19 reference that I generated following this 10x documentation and used in the STAR alignment phase.

However, I still get the same issue as before.

For your second suspicion, I checked my droptag log, i.e. tag_main.log, and confirmed that I didn't use -s option.

Do you have any suggestion regarding my issue? I can provide more information if you need for your diagnosis.

Thank you!

Sincerely,
Yiming

Hi Yiming,
Can you please share your config and several lines of one of your .bam files? The command is

samtools view ./alignment.bam | head -n 10

Sure. Here is the screen output by running the samtools view command you sent to me.

It seems that GitHub doesn't allow me to upload XML files, so I converted it to a txt file here.

Oh, got it. You use the -f option, which means that your bam file has tags with the barcode and UMI sequences. That's the case if the file was produced with CellRanger. But in your case, these sequences are encoded in the read names (e.g. FTVR1!ATCGTAACTTGCACGC#TTAACG), so you don't need to use this flag.
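For anyone hitting the same warning, the check described above can be sketched in a few lines of Python. This is only an illustration, not part of dropEst: the function name barcode_location is made up, the read-name pattern is inferred from the single example in this thread, and the CB/UB tag names are an assumption based on what CellRanger writes (the tags dropest actually looks for may be set differently in your config). The two SAM lines are invented for the demonstration.

```python
import re

# Read-name layout seen in this thread: <read id>!<cell barcode>#<UMI>
# (inferred from one example, so treat the pattern as an assumption).
NAME_PATTERN = re.compile(r"^[^!]+![ACGTN]+#[ACGTN]+$")

# Assumption: CellRanger-style tags (CB = cell barcode, UB = UMI).
TAG_PREFIXES = ("CB:Z:", "UB:Z:")


def barcode_location(sam_line):
    """Guess where one SAM record stores its barcode and UMI.

    Returns "tags", "read_name" or "unknown".
    """
    fields = sam_line.rstrip("\n").split("\t")
    qname = fields[0]        # first mandatory SAM column: read name
    optional = fields[11:]   # everything after the 11 mandatory columns
    has_tags = all(
        any(field.startswith(prefix) for field in optional)
        for prefix in TAG_PREFIXES
    )
    if has_tags:
        return "tags"
    if NAME_PATTERN.match(qname):
        return "read_name"
    return "unknown"


# Invented records, shaped like one line of `samtools view` output.
NAME_STYLE = "\t".join([
    "FTVR1!ATCGTAACTTGCACGC#TTAACG", "0", "chr1", "100", "255",
    "5M", "*", "0", "0", "ACGTA", "IIIII", "NH:i:1",
])
TAG_STYLE = "\t".join([
    "read42", "0", "chr1", "100", "255", "5M", "*", "0", "0",
    "ACGTA", "IIIII", "CB:Z:ATCGTAACTTGCACGC-1", "UB:Z:TTAACGTTAA",
])

print(barcode_location(NAME_STYLE))
print(barcode_location(TAG_STYLE))
```

If the lines from samtools view come back as "read_name", run dropest without -f; if they come back as "tags", the -f option is appropriate.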

It works now after dropping the -f option when running dropest. Thank you so much for your help!