[catalog] Updates on the COSMIC signatures - ValueError
jinhys opened this issue · 3 comments
Hi @Hu-JIN, thanks for developing the great tool!
I’m trying to run the "refitting" step (using musical.refit.refit()) and I get an error that seems to come from a discrepancy between the number of signatures in the matrix W derived from my input matrix X (i.e., COSMIC SBS v3.4) and the number in the matrix W from the built-in COSMIC signatures (v3.2).
The error message is:
File "/home/user/tool/miniforge3/lib/python3.10/site-packages/musical/refit.py", line 35, in refit
raise ValueError('X and W have different indices.')
ValueError: X and W have different indices.
Is there a command to load the latest COSMIC signatures into MuSiCal's built-in catalog? If not, could you please add the updated COSMIC signature files to the existing built-in catalog list?
I’ve included a link to the downloadable COSMIC signature files for your reference:
https://cancer.sanger.ac.uk/signatures/downloads/
Please let me know if you need further information. Thank you!
Hi,
The error occurs because the two dataframes X and W have different indices, i.e., mutation channels. The signature files downloaded from COSMIC use a different channel order, so you need to reorder the rows of W so that its index matches the order in your X matrix. More specifically, the SBS signature files downloaded from COSMIC follow the order A[C>A]A, A[C>A]C, A[C>A]G, A[C>A]T, A[C>G]A, A[C>G]C... Note that this is different from the order we commonly use in plots: A[C>A]A, A[C>A]C, A[C>A]G, A[C>A]T, C[C>A]A, C[C>A]C...
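For reference, here is a minimal pandas sketch of that reordering. It uses small toy dataframes rather than real data, and the helper name align_to is hypothetical (it is not part of MuSiCal). It assumes both X and W are pandas DataFrames indexed by channel labels written in the same format, and that they contain the same set of channels in a different order.

```python
import pandas as pd

# Toy stand-ins for the real matrices (only 4 channels for brevity).
# X uses the plot-style order; W uses the COSMIC-download-style order.
x_channels = ["A[C>A]A", "A[C>A]C", "C[C>A]A", "A[C>G]A"]
w_channels = ["A[C>A]A", "A[C>A]C", "A[C>G]A", "C[C>A]A"]

X = pd.DataFrame({"sample1": [5, 3, 2, 7]}, index=x_channels)
W = pd.DataFrame({"SBS_a": [0.4, 0.3, 0.2, 0.1]}, index=w_channels)


def align_to(W, X):
    """Return W with its rows reordered to match X's index (hypothetical helper)."""
    # Fail loudly if the channel sets differ; reordering cannot fix that case.
    missing = X.index.difference(W.index)
    extra = W.index.difference(X.index)
    if len(missing) > 0 or len(extra) > 0:
        raise ValueError(
            f"Channel sets differ: missing from W {list(missing)}, "
            f"not in X {list(extra)}"
        )
    # Label-based selection reorders the rows of W to follow X's order.
    return W.loc[X.index]


W_aligned = align_to(W, X)

# After alignment, the two indices are identical, including order.
assert W_aligned.index.equals(X.index)
```

With real data, W would be the COSMIC table loaded into a DataFrame indexed by channel label. If the helper raises, the labels themselves differ (for example, a different label format or a different number of channels), and that needs fixing before reordering.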
Let me know if you have further questions!