kharchenkolab/numbat

Better error handling?!


Hi numbat developers.

I am not very sure about the genomics part of the analysis here - actually I have not even read your paper well enough.
But I am really trying to get this thing working for a client.
I have problems running a whole sample through the second stage of numbat. The bam file has been converted into the tsv.gz file (SNPs have been called). But the problems start when I then try to assess whether a cell is a clear cancer cell with genomic alterations using

```
result = numbat::run_numbat(
    subset, # gene x cell integer UMI count matrix
    ref_hca, # reference expression profile, a gene x cell type normalized expression level matrix
    df_allele_ATC2, # allele dataframe generated by pileup_and_phase script
    genome = "hg38",
    t = 1e-5,
    ncores = 10,
    min_cells = 100,
    plot = TRUE,
    out_dir = paste( sep="", './numbat_run_', sampleid, "/", batch )
)
```

I do not plan to use the result object at all (at the moment); I am only looking for the clone_post_2.tsv file.
As the name `subset` implies, running the full sample does not work in most cases (I have 10 samples - some tumor, some healthy).
The healthy samples usually fail with an error like this:

```
Running HMMs on 2 cell groups..

Warning message in mclapply(bulks %>% split(.$sample), mc.cores = ncores, function(bulk) {:
“scheduled core 1 encountered error in user code, all values of the job will be affected”
An error ocured - ignoring that! Error in find_common_diploid(bulks, gamma = gamma, alpha = alpha, ncores = ncores): Error in smooth_segs(., min_genes = min_genes) :
  No segments containing more than 10 genes for CHROM 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22.
```

Why? Would it not be nicer to report that these cells are likely healthy?
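
Something along these lines would already help on the caller's side. This is only a minimal sketch: `classify_numbat_error` and `safe_run` are hypothetical helper names, not part of numbat, and matching on the message text is an assumption that breaks as soon as the wording of the error changes. Whether "no segments" really means "healthy" is also an assumption, not something numbat states.

```r
# Hypothetical helper (not part of numbat): map a known error message
# to a short status label.
classify_numbat_error <- function(msg) {
  if (grepl("No segments containing more than", msg, fixed = TRUE)) {
    "no_cnv_segments"   # assumption: possibly a sample without detectable CNVs
  } else if (grepl("C stack usage", msg, fixed = TRUE)) {
    "c_stack_overflow"
  } else {
    "unknown_error"
  }
}

# Hypothetical wrapper: call any function and return a status list
# instead of stopping on the first error.
safe_run <- function(f, ...) {
  tryCatch(
    list(status = "ok", result = f(...)),
    error = function(e) {
      msg <- conditionMessage(e)
      list(status = classify_numbat_error(msg), message = msg)
    }
  )
}

# Stand-in for the real call, only to show the behaviour.
fake_run <- function() {
  stop("No segments containing more than 10 genes for CHROM 1,2")
}
res <- safe_run(fake_run)
print(res$status)
```

In a real run, `fake_run` would be replaced by a function that calls `numbat::run_numbat(...)` with the arguments shown above, so each of the 10 samples ends up with a status instead of a bare error.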

When I run putative cancer cells I get either a working analysis (which is absolutely great!) or (more likely) a really frustrating error message:

```
running hclust...

An error ocured - ignoring that! Error: C stack usage  7969636 is too close to the limit
```

I found an issue here addressing a similar C stack error. But that time it seems to have been caused by too few cells being in a single cluster.
I have installed the 'fix', but I also assume that a fix from one year ago has made it into the main package by now - right?

This C stack error is really annoying - can you tell me how to debug that?!
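
One thing that can be inspected from the R side is the C stack limit of the session itself. A minimal sketch using base R's `Cstack_info()`; `report_cstack` is a hypothetical helper name, and whether a larger stack actually resolves the numbat error is an assumption, not something confirmed here.

```r
# Hypothetical helper: summarise how much of the C stack is currently in use.
report_cstack <- function() {
  # Base R: named vector with elements size, current, direction, eval_depth.
  info <- Cstack_info()
  c(
    size_bytes    = unname(info["size"]),
    current_bytes = unname(info["current"]),
    fraction_used = unname(info["current"] / info["size"])
  )
}

print(report_cstack())
```

If `size_bytes` is around 8 MB, which is close to the 7969636 bytes in the error above, the limit can be raised in the shell (for example with `ulimit -s`) before R or the notebook server is started. The stack size cannot be changed from inside an already running R session.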

And no - I can (of course) not give you a minimal reproducible case :-( - sorry.