DEG plot_volcano KeyError: 'explode'

Question

Closed this issue 5 months ago · 4 comments

Describe the bug
Hello, and thanks for your work on omicverse.
I recently used it (version 1.5.7) for DEG analysis and ran into two confusing problems.

First, dds.deg_analysis finishes quickly on the CD4 T cells of 'kang.h5ad' (test_adata) but runs very slowly on the full 'kang.h5ad' (adata), even though the full dataset has only about twice as many cells: adata.shape = 25,461 × 2000, test_adata.shape = 11,760 × 2000. Why does the speed of dds.deg_analysis differ so much between the two? My own dataset has nearly 100,000 cells, so it is even harder to process.
Second, I get a KeyError: 'explode' when I try to draw a volcano plot from the DEG results.

Thank you very much for any help resolving these problems!

Answer 1 · 2024-01-20T19:29:16.000Z

Hi,

For the first question, I think it's because you chose the DEseq2 mode, which has to fit a negative binomial (NB) model, so the calculation is slower. The tutorial compares the different modes, and there is no problem with using ttest instead.
For the second question, the problem is caused by a version of adjustText that is too new. You can downgrade it to 0.8. We will fix this issue in the next version of omicverse and optimise the volcano plotting.
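As a quick way to check whether an environment is affected, here is a minimal sketch that compares the installed adjustText version against the 0.8 ceiling suggested above. The helper names (`parse_version`, `is_newer_than`, `check_adjusttext`) are made up for illustration and are not part of omicverse or adjustText. The version parsing is deliberately simple and only looks at the leading numeric parts.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.7.3' into (0, 7, 3); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_newer_than(installed, ceiling="0.8"):
    """True if `installed` is a later release than `ceiling`."""
    return parse_version(installed) > parse_version(ceiling)


def check_adjusttext():
    """Report whether the installed adjustText is above the suggested 0.8."""
    try:
        installed = metadata.version("adjustText")
    except metadata.PackageNotFoundError:
        return "adjustText is not installed"
    if is_newer_than(installed):
        return (
            f"adjustText {installed} is newer than 0.8; "
            "consider: pip install adjustText==0.8"
        )
    return f"adjustText {installed} is at or below 0.8"


if __name__ == "__main__":
    print(check_adjusttext())
```

If the check reports a newer version, pinning it with `pip install adjustText==0.8` in the same environment as omicverse 1.5.7 should match the suggestion above.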

Zehua

Answer 2 · 2024-01-21T07:10:31.000Z

Hi, Zehua:

Thanks for your prompt reply. The second problem was resolved when I changed the version of adjustText to 0.7.3. I'd like to discuss the first problem further, though.

When I run test_adata (CD4 T cells of kang.h5ad), the records look like this:

When I run adata (kang.h5ad), the records look like this:

It makes no further progress within 2 hours.
When I use my own adata, the records are the same as for kang.h5ad:

It has been running for 2 days, and I wonder whether there is something wrong with my adata. Could it really take this long, even allowing for the NB model fit?

Besides, when I run the pp.qc step on my server, it fails with "illegal instruction: core dumped".

Any idea about this problem?

Thank you very much for your contribution to omicverse and for your help!

Best regards,
Lifang

Answer 3 · 2024-01-21T07:15:04.000Z

For the first problem, the differences I see between the records are the dispersion trend curve, logres_prior and sigma_prior, so I wonder whether something is wrong with the data structure. Just guessing.

Answer 4 · 2024-01-21T08:13:24.000Z

This looks like a memory overflow. Using DEseq2 to analyse all cells is not recommended.
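One way to read this advice is to cap the number of cells per group before running the DEseq2 mode. The sketch below is only an illustration of that idea using the standard library. The function name `subsample_per_group`, the cap of 1000 cells, and the toy labels are all assumptions, not something from omicverse or from this thread.

```python
import random
from collections import defaultdict


def subsample_per_group(labels, max_per_group, seed=0):
    """Return sorted indices keeping at most `max_per_group` cells per label."""
    by_group = defaultdict(list)
    for index, label in enumerate(labels):
        by_group[label].append(index)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = []
    for indices in by_group.values():
        if len(indices) > max_per_group:
            indices = rng.sample(indices, max_per_group)
        kept.extend(indices)
    return sorted(kept)


# Toy labels standing in for a per-cell condition column
# (for example adata.obs["condition"]; the column name is hypothetical).
labels = ["ctrl"] * 5000 + ["stim"] * 300
keep = subsample_per_group(labels, max_per_group=1000)
print(len(keep))  # 1300: 1000 sampled "ctrl" cells plus all 300 "stim" cells
```

The returned indices could then be used to subset the data (for an AnnData object, something like `adata[keep]`) before the DEG step. Whether a subset of this size still has enough power is a separate question. Switching to the ttest mode mentioned in the first answer avoids the NB fit altogether.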