parklab/MuSiCal

[catalog] Updates on the COSMIC signatures - ValueError

jinhys opened this issue · 3 comments

Hi @Hu-JIN, thanks for developing the great tool!

I’m trying to run the "refitting" step (using musical.refit.refit()) and I get an error. It seems to come from a discrepancy between the number of signatures in the matrix W derived from my input matrix X (i.e., COSMIC SBS v3.4) and the number in the matrix W from the COSMIC signatures (v3.2).
The error message is:

File "/home/user/tool/miniforge3/lib/python3.10/site-packages/musical/refit.py", line 35, in refit
raise ValueError('X and W have different indices.')
ValueError: X and W have different indices.

Is there a command to load the latest COSMIC signatures into MuSiCal's built-in catalog? If not, could you please add the updated COSMIC signature files to the existing built-in catalog list?

I’ve included a link to the downloadable COSMIC signature files for your reference:
https://cancer.sanger.ac.uk/signatures/downloads/

Please let me know if you need further information. Thank you!

Hi,

The error occurs because the two dataframes X and W have different indices, i.e., mutation channels. The signature files downloaded from COSMIC list the channels in a different order, so you need to reorder the indices so that they match the order in your X matrix. More specifically, SBS signature files downloaded from COSMIC follow the order A[C>A]A, A[C>A]C, A[C>A]G, A[C>A]T, A[C>G]A, A[C>G]C... Note that this differs from the order commonly used in plots: A[C>A]A, A[C>A]C, A[C>A]G, A[C>A]T, C[C>A]A, C[C>A]C...
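
For anyone hitting the same error, here is a minimal sketch of the reordering with pandas. It assumes X and W are pandas DataFrames indexed by channel labels such as A[C>A]A. The toy matrices and the names plot_order, cosmic_order, and W_aligned are made up for illustration and are not part of MuSiCal.

```python
from itertools import product

import pandas as pd

BASES = "ACGT"
SUBS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# Plot order: substitution type first, then 5' base, then 3' base.
plot_order = [
    f"{five}[{sub}]{three}" for sub, five, three in product(SUBS, BASES, BASES)
]

# The order described above for COSMIC downloads is a plain alphabetical
# sort of the same labels (5' base first, then substitution, then 3' base).
cosmic_order = sorted(plot_order)

# Toy stand-ins (hypothetical data): X in plot order, W in COSMIC order.
X = pd.DataFrame({"sample1": range(96)}, index=plot_order)
W = pd.DataFrame({"SBS_toy": [float(i) for i in range(96)]}, index=cosmic_order)

# Reordering only helps if both matrices contain exactly the same channels.
if set(X.index) != set(W.index):
    raise ValueError("X and W contain different channel labels.")

# Reorder the rows of W so that its index matches X row for row.
W_aligned = W.reindex(X.index)
```

After this step, W_aligned.index.equals(X.index) is True, and W_aligned can be passed in place of W. Each row keeps its own values; only the row order changes.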

Let me know if you have further questions!

Hi @Hu-JIN, thanks for your reply and the detailed information! I see, I will double-check the order of my matrix X then. I'll reach out to you if I need further help!