PoisonAlien/maftools

SBS1 counts of TCGA BRCA samples

Closed this issue · 7 comments

Hi Anand,

Question
Is there any way to extract the counts of SBS1 (N[C>T]G) trinucleotide mutations (https://cancer.sanger.ac.uk/signatures/sbs/sbs1/) in TCGA BRCA samples?

In other words, I need the absolute counts of all "A[C>T]G", "C[C>T]G", "G[C>T]G", and "T[C>T]G" mutations (as a proxy for the SBS1 aging signature) for all TCGA BRCA samples. Please let me know.

Session info

 maftools_2.17.0

Thanks,
Praveen

Hi,
You can extract it from trinucleotideMatrix output.
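
As a minimal sketch of what that extraction could look like: the toy matrix below stands in for the `nmf_matrix` element of the trinucleotideMatrix output, assuming it has samples in rows and contexts in columns named like "A[C>T]G". The sample IDs and counts are made up for illustration, and the column naming is an assumption worth checking against your own object.

```r
# Toy stand-in for tnm$nmf_matrix (assumed layout: samples x contexts).
# Sample IDs and counts are invented for illustration only.
nmf_matrix <- matrix(
  c(3, 1, 4, 2, 7,
    0, 2, 1, 5, 9),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("sample_A", "sample_B"),
    c("A[C>T]G", "C[C>T]G", "G[C>T]G", "T[C>T]G", "A[C>A]A")
  )
)

# The four N[C>T]G contexts used as an SBS1 proxy
sbs1_contexts <- c("A[C>T]G", "C[C>T]G", "G[C>T]G", "T[C>T]G")

# Fail early if the column naming differs from what is assumed here
stopifnot(all(sbs1_contexts %in% colnames(nmf_matrix)))

# Per-sample SBS1-proxy count: sum of the four N[C>T]G columns
sbs1_counts <- rowSums(nmf_matrix[, sbs1_contexts, drop = FALSE])
print(sbs1_counts)
```

With a real result you would replace the toy matrix with the `nmf_matrix` element of your own trinucleotideMatrix output; `drop = FALSE` keeps the subset a matrix even if only one sample is present.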

Hi Anand,

I got hold of TCGA BRCA MAF (TCGA.BRCA.mutect.995c0111-d90b-4140-bee7-3845436c3b42.DR-10.0.somatic.maf.csv).

When I ran the following, I got this error:

tcga.tnm = trinucleotideMatrix(maf = maf.Flt, ref_genome = 'BSgenome.Hsapiens.UCSC.hg19', add = TRUE, useSyn = TRUE)

-Extracting 5' and 3' adjacent bases
Error in .Call2("C_solve_user_SEW", refwidths, start, end, width, translate.negative.coord,  : 
  solving row 56: 'allow.nonnarrowing' is FALSE and the supplied start (81914409) is > refwidth + 1

Solved this by switching to the correct reference genome, BSgenome.Hsapiens.UCSC.hg38.

Thanks,
Praveen
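
One way to catch this kind of build mismatch before running trinucleotideMatrix is to look at the `NCBI_Build` column that MAF files normally carry. The sketch below uses a toy data frame in place of the real MAF table; the helper logic and the mapping from build name to BSgenome package are illustrative assumptions, not part of maftools.

```r
# Toy stand-in for the MAF table; values are invented for illustration.
maf_df <- data.frame(
  Hugo_Symbol = c("TP53", "PIK3CA"),
  NCBI_Build  = c("GRCh38", "GRCh38"),
  stringsAsFactors = FALSE
)

# A MAF should be on a single genome build
build <- unique(maf_df$NCBI_Build)
stopifnot(length(build) == 1)

# Hypothetical mapping from build label to the matching BSgenome package name
ref_genome <- switch(
  build,
  "GRCh38" = "BSgenome.Hsapiens.UCSC.hg38",
  "GRCh37" = "BSgenome.Hsapiens.UCSC.hg19",
  stop("Unrecognised genome build: ", build)
)
print(ref_genome)
```

The resulting string could then be passed as the `ref_genome` argument, as in the call shown above.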

Now my question is about the trinucleotideMatrix.object$nmf_matrix values. Are those absolute counts? I am asking because the values I got have a mean of ~15 and a median of 12. This seems too low for SBS1 counts. Do you think it is because they come from WES? Please let me know your opinion.

I used the option useSyn = TRUE to include synonymous variants in the analysis.

Yeah, those are counts. You can confirm the numbers by counting the rows of your MAF, either overall or per sample.
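
A rough sketch of that check, using toy data in place of the real MAF table and matrix: count the single-nucleotide rows per sample and compare them with the row sums of the matrix. It assumes the standard MAF columns `Tumor_Sample_Barcode` and `Variant_Type`, and that only SNPs contribute to the matrix; on real data the totals may differ slightly if some variants are dropped along the way.

```r
# Toy MAF rows; sample IDs and variants are invented for illustration.
maf_df <- data.frame(
  Tumor_Sample_Barcode = c("sample_A", "sample_A", "sample_A",
                           "sample_B", "sample_B"),
  Variant_Type = c("SNP", "SNP", "DEL", "SNP", "SNP"),
  stringsAsFactors = FALSE
)

# Toy stand-in for tnm$nmf_matrix (assumed layout: samples x contexts)
nmf_matrix <- matrix(
  c(1, 1,
    2, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("sample_A", "sample_B"), c("A[C>T]G", "T[C>T]G"))
)

# Count SNP rows per sample in the MAF
snv_df <- maf_df[maf_df$Variant_Type == "SNP", ]
maf_counts <- table(snv_df$Tumor_Sample_Barcode)

# Total mutations per sample according to the matrix
matrix_totals <- rowSums(nmf_matrix)

# Side-by-side comparison, aligned by sample name
comparison <- data.frame(
  sample      = names(matrix_totals),
  from_matrix = as.vector(matrix_totals),
  from_maf    = as.vector(maf_counts[names(matrix_totals)])
)
print(comparison)
```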

Thanks a lot Shixiang.
