Starlitnightly/omicverse

DEG plot_volcano KeyError: 'explode'

Closed this issue · 4 comments

Describe the bug
Hello, thanks for your work on omicverse.
I recently used it (version 1.5.7) for DEG analysis and ran into two confusing problems.

  1. dds.deg_analysis finishes quickly on the CD4 T cell subset of 'kang.h5ad' (test_adata) but runs very slowly on the full 'kang.h5ad' (adata), even though adata has only about twice as many cells: adata.shape = (25461, 2000), test_adata.shape = (11760, 2000). Why do the run times differ so much? My own dataset has nearly 100,000 cells, so it is even harder to process.

  2. I get KeyError: 'explode' when I try to draw a volcano plot from the DEG results.

image
image

I would be very grateful if these problems could be resolved!

Hi,

  • For the first question, I think it's because you chose the DESeq2 mode, which has to fit a negative binomial (NB) model, so the calculation is slower. The tutorial compares the different modes, and using ttest works without problems.
  • For the second question, the problem is caused by an adjustText version that is too new. You can downgrade it to 0.8. We will fix this issue in the next version of omicverse and optimise the volcano plotting.
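
As a quick way to check whether the second point applies, here is a minimal sketch that reads the installed adjustText version and compares it with 0.8. The helper names (`parse_version`, `is_too_new`, `installed_adjusttext_version`) are hypothetical and not part of omicverse or adjustText. The 0.8 threshold is taken from the advice above, not from the adjustText changelog.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '0.7.3' into a 3-tuple of ints.

    Only the leading digits of each dot-separated piece are used, and
    missing pieces are padded with zeros, so '0.8' becomes (0, 8, 0).
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple((parts + [0, 0, 0])[:3])


def is_too_new(installed, newest_ok="0.8"):
    """Return True if `installed` is newer than `newest_ok`.

    The default threshold is an assumption based on the advice in this thread.
    """
    return parse_version(installed) > parse_version(newest_ok)


def installed_adjusttext_version():
    """Return the installed adjustText version string, or None if absent."""
    try:
        return metadata.version("adjustText")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_adjusttext_version()
    if version is None:
        print("adjustText is not installed")
    elif is_too_new(version):
        print(f"adjustText {version} is newer than 0.8; consider downgrading")
    else:
        print(f"adjustText {version} should be fine")
```

If the check reports a newer version, pinning the package with pip (for example `pip install "adjustText==0.8"`) is the usual way to downgrade.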

Zehua

Hi, Zehua:

Thanks for your prompt reply. The second problem was resolved when I changed the adjustText version to 0.7.3. I'd like to discuss the first problem further.

When I run test_adata (CD4 T cells of kang.h5ad), the log looks like this:
image
image
When I run adata (kang.h5ad), the log looks like this:
image
It makes no further progress within 2 hours.
When I use my own adata, the log is the same as for kang.h5ad:
image

It has been running for 2 days, and I wonder whether there is something wrong with my adata. Can it really take this long, even if the NB model has to be fitted?
image

In addition, when I run the pp.qc step on my server, it fails with "illegal instruction: core dumped".
image
Any idea about this problem?

Thank you very much for your contribution to omicverse and for your help!

Best regards,
Lifang

For the first problem, I noticed that the two logs differ in these entries: dispersion trend curve, logres_prior and sigma_prior. So I wonder whether something is wrong with the data structure? Just guessing.

This looks like a memory overflow. It is not recommended to run DESeq2 on all cells.
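
One possible way to act on this, offered only as an assumption since the thread does not prescribe a method, is to cap the number of cells per condition before running the DESeq2 mode. The sketch below picks a random subset of cell indices per group label. `downsample_per_group` is a hypothetical helper, not an omicverse function, and the cap value is arbitrary.

```python
import random
from collections import defaultdict


def downsample_per_group(labels, max_per_group, seed=0):
    """Return sorted cell indices, keeping at most `max_per_group` per label.

    `labels` holds one group label per cell (for example the condition
    column of adata.obs, converted to a list).
    """
    rng = random.Random(seed)

    # Collect the positions of the cells belonging to each group.
    by_group = defaultdict(list)
    for idx, label in enumerate(labels):
        by_group[label].append(idx)

    keep = []
    for idxs in by_group.values():
        # Groups that are already small enough are kept whole.
        if len(idxs) > max_per_group:
            idxs = rng.sample(idxs, max_per_group)
        keep.extend(idxs)
    return sorted(keep)


if __name__ == "__main__":
    labels = ["ctrl"] * 10 + ["stim"] * 3
    keep = downsample_per_group(labels, max_per_group=5)
    print(len(keep), "cells kept")
    # With an AnnData object the subset could then be taken as adata[keep]
    # (assumption: `labels` was built from adata.obs in the same cell order).
```

Whether a subsample is statistically adequate depends on the dataset. Switching to the ttest mode, as suggested earlier in the thread, avoids the NB fit entirely.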