kharchenkolab/dropEst

DropSeq, no cells detected

MingBit opened this issue · 6 comments

Hey,

Thanks for the great pipeline. I'm having an issue generating the report from cell.counts.rds.

processing file: report.Rmd
  |..                                                               |   3%
   inline R code fragments

  |....                                                             |   5%
label: unnamed-chunk-1 (with options) 
List of 1
 $ include: logi FALSE
.......

  |.................................                                |  51%
  ordinary text without R code

  |...................................                              |  54%
label: unnamed-chunk-10 (with options) 
List of 2
 $ fig.width : num 5
 $ fig.height: num 3

Quitting from lines 97-98 (report.Rmd) 
Error in if (zero_range(as.numeric(limits))) { : 
  missing value where TRUE/FALSE needed
Calls: <Anonymous> ... <Anonymous> -> f -> <Anonymous> -> f -> <Anonymous> -> f
In addition: There were 16 warnings (use warnings() to see them)

Any ideas?!

issue code:
PlotCellsNumberLogLog(d$aligned_umis_per_cell, T, show.legend=F)

issue code:
scores <- ScorePipelineCells(d, mitochondrion.genes = if (exists("mit_genes")) mit_genes else NULL, tags.data = if (exists("tags_data")) tags_data else NULL)
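The `zero_range(as.numeric(limits))` error appears to come from the axis-scale code used for plotting, and it is typically raised when the axis limits evaluate to NA, for example when the plotted vector is empty or has no finite positive values. A minimal base-R sketch for inspecting the input before plotting follows. `summarize_umi_vector` is a hypothetical helper, not part of dropEst, and the file path in the commented usage is an assumption:

```r
# Hypothetical helper (not part of dropEst): summarise a per-cell UMI vector
# so that empty or non-finite input is visible before plotting.
summarize_umi_vector <- function(umis) {
  umis <- as.numeric(umis)
  list(
    n_cells    = length(umis),
    n_na       = sum(is.na(umis)),
    n_positive = sum(umis > 0, na.rm = TRUE),
    # TRUE only if at least one finite, positive value exists (needed for a log-log plot)
    plottable  = length(umis) > 0 && any(is.finite(umis) & umis > 0)
  )
}

# Toy inputs standing in for d$aligned_umis_per_cell
ok_summary    <- summarize_umi_vector(c(AAAC = 120, AAAG = 15, AACT = 3))
empty_summary <- summarize_umi_vector(numeric(0))

# Assumed usage on real output (the path is an assumption):
# d <- readRDS("cell.counts.rds")
# str(summarize_umi_vector(d$aligned_umis_per_cell))
```

If `plottable` is FALSE for the real data, the plotting failure is a symptom of empty upstream output rather than a problem in the report itself.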

Issue fixed by modifying two parameters: 👍

  1. drop_seq.xml: <reads_per_out_file>unlimited</reads_per_out_file>
  2. re-run dropest without -u

Sorry for the second question.

As I've learned from here, this package has not been properly tested with Drop-seq datasets. Still, it works fairly smoothly with my Drop-seq data.
The figures below were generated from one sample with -G 500. For merging multiple samples, my question is whether I should filter cells (#gene > 500 & #UMI > 5000) before or after the merge. Or maybe just take the 'estimated cells' from dropest, in which case I should set -G a little lower. Or maybe use the 'estimated cells' from each sample and re-filter them with the above parameters after the merge.

It was estimated as 321 cells, but we only have 96 cell barcodes...
[Figures 1-4: images not preserved]

Issue fixed by modifying two parameters: +1

drop_seq.xml: <reads_per_out_file>unlimited</reads_per_out_file>
re-run dropest without -u

Interesting, thanks for the explanation! I'll take a look at what could be wrong.

As I've learned from here, this package has not been properly tested with Drop-seq datasets. Still, it works fairly smoothly with my Drop-seq data.

Again, thanks for the note!

The figures below were generated from one sample with -G 500. For merging multiple samples, my question is whether I should filter cells (#gene > 500 & #UMI > 5000) before or after the merge.

It depends on which samples you mean. If they are different sequencing runs of the same Drop-seq sample, then the proper way is to pass all bam files from the several droptag runs to a single dropest run. But if they are separate Drop-seq runs, then you need to process and filter them individually before merging.
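For the second case (separate Drop-seq runs), the filter-then-merge order can be sketched in base R as below. This is only an illustration under assumptions: each sample is taken to be a count matrix with genes in rows and cells in columns, `filter_cells` and `merge_samples` are hypothetical helpers rather than dropEst functions, and the default thresholds are the ones quoted in the question. The toy data uses much smaller thresholds so the example stays readable.

```r
# Hypothetical helpers, not part of dropEst.
# Keep cells (columns) with more than min_genes detected genes
# and more than min_umis total UMIs.
filter_cells <- function(cm, min_genes = 500, min_umis = 5000) {
  genes_per_cell <- colSums(cm > 0)
  umis_per_cell  <- colSums(cm)
  cm[, genes_per_cell > min_genes & umis_per_cell > min_umis, drop = FALSE]
}

# Merge already-filtered samples on their shared genes,
# prefixing cell names with the sample name to keep them unique.
merge_samples <- function(cms) {
  shared <- Reduce(intersect, lapply(cms, rownames))
  parts <- lapply(names(cms), function(s) {
    m <- cms[[s]][shared, , drop = FALSE]
    if (ncol(m) > 0) colnames(m) <- paste(s, colnames(m), sep = "_")
    m
  })
  do.call(cbind, parts)
}

# Toy data: two samples with partly different gene sets
s1 <- matrix(c(5, 0, 3, 1, 0, 0), nrow = 3,
             dimnames = list(c("g1", "g2", "g3"), c("c1", "c2")))
s2 <- matrix(c(4, 2, 2, 0, 0, 1), nrow = 3,
             dimnames = list(c("g1", "g2", "g4"), c("c1", "c2")))

# Filter each sample individually first, then merge
filtered <- lapply(list(s1 = s1, s2 = s2), filter_cells,
                   min_genes = 1, min_umis = 5)
merged <- merge_samples(filtered)
```

Filtering per sample keeps the thresholds tied to each run's own depth; merging on shared genes is one possible choice, and filling missing genes with zeros would be another.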

It was estimated as 321 cells, but we only have 96 cell barcodes...

This procedure works better with larger sample sizes: here there is simply too little data for training. So in this case your expert knowledge is definitely superior to the scoring output.
But what exactly do you mean by "we only have 96 cell barcodes"? Do you have the whitelist?

Thank you for your reply. 👍
Yes, the 96 cell barcodes are a whitelist.
I re-ran the dropest step with the following changes (tried as 1+2+3 and as 1+3):

  1. add new <barcodes_file>../barcodes.csv</barcodes_file>
  2. change <cb>XC</cb> to <cr>XC</cr>
  3. re-run it with dropest -m

No cells could be detected.
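One check that may help narrow this down is whether the barcodes seen in the data match the whitelist exactly, including length and case. The sketch below works on plain character vectors and makes no assumption about the layout of barcodes.csv. `whitelist_overlap` is a hypothetical helper, not part of dropEst, and the barcodes are toy values:

```r
# Hypothetical helper, not part of dropEst: compare observed cell barcodes
# against a whitelist and report exact matches and barcode lengths.
whitelist_overlap <- function(observed, whitelist) {
  observed  <- toupper(trimws(observed))
  whitelist <- toupper(trimws(whitelist))
  hits <- unique(observed[observed %in% whitelist])
  list(
    n_whitelist       = length(unique(whitelist)),
    n_matched         = length(hits),
    lengths_observed  = sort(unique(nchar(observed))),
    lengths_whitelist = sort(unique(nchar(whitelist)))
  )
}

# Toy example: one whitelist entry has a different length than the observed barcodes
res <- whitelist_overlap(
  observed  = c("ACGTACGTACGT", "TTTTCCCCGGGG", "acgtacgtacgt"),
  whitelist = c("ACGTACGTACGT", "GGGGAAAATTTT", "ACGTACGT")
)
```

A result with zero matches, or with differing barcode lengths between the two sides, would point at a whitelist formatting mismatch rather than at the cell detection itself.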