Incorrect labels in code documentation of human_trinuc_freqs
Closed this issue · 1 comments
gevro commented
Hi,
I think the labels in your documentation of human_trinuc_freqs are incorrect. The numbers are correct and in the correct order.
However, the code comments on the right-hand side are not. Based on a manual hg19 trinucleotide frequency calculation, I'm quite sure this should be the order of the labels in the documentation:
[1] "ACA" "ACC" "ACG" "ACT" "CCA" "CCC" "CCG" "CCT" "GCA" "GCC" "GCG" "GCT" "TCA" "TCC" "TCG" "TCT" "ACA" "ACC" "ACG"
[20] "ACT" "CCA" "CCC" "CCG" "CCT" "GCA" "GCC" "GCG" "GCT" "TCA" "TCC" "TCG" "TCT" "ACA" "ACC" "ACG" "ACT" "CCA" "CCC"
[39] "CCG" "CCT" "GCA" "GCC" "GCG" "GCT" "TCA" "TCC" "TCG" "TCT" "ATA" "ATC" "ATG" "ATT" "CTA" "CTC" "CTG" "CTT" "GTA"
[58] "GTC" "GTG" "GTT" "TTA" "TTC" "TTG" "TTT" "ATA" "ATC" "ATG" "ATT" "CTA" "CTC" "CTG" "CTT" "GTA" "GTC" "GTG" "GTT"
[77] "TTA" "TTC" "TTG" "TTT" "ATA" "ATC" "ATG" "ATT" "CTA" "CTC" "CTG" "CTT" "GTA" "GTC" "GTG" "GTT" "TTA" "TTC" "TTG"
[96] "TTT"
i.e. these labels:
[1] "ACA>AAA" "ACC>AAC" "ACG>AAG" "ACT>AAT" "CCA>CAA" "CCC>CAC" "CCG>CAG" "CCT>CAT" "GCA>GAA" "GCC>GAC" "GCG>GAG"
[12] "GCT>GAT" "TCA>TAA" "TCC>TAC" "TCG>TAG" "TCT>TAT" "ACA>AGA" "ACC>AGC" "ACG>AGG" "ACT>AGT" "CCA>CGA" "CCC>CGC"
[23] "CCG>CGG" "CCT>CGT" "GCA>GGA" "GCC>GGC" "GCG>GGG" "GCT>GGT" "TCA>TGA" "TCC>TGC" "TCG>TGG" "TCT>TGT" "ACA>ATA"
[34] "ACC>ATC" "ACG>ATG" "ACT>ATT" "CCA>CTA" "CCC>CTC" "CCG>CTG" "CCT>CTT" "GCA>GTA" "GCC>GTC" "GCG>GTG" "GCT>GTT"
[45] "TCA>TTA" "TCC>TTC" "TCG>TTG" "TCT>TTT" "ATA>AAA" "ATC>AAC" "ATG>AAG" "ATT>AAT" "CTA>CAA" "CTC>CAC" "CTG>CAG"
[56] "CTT>CAT" "GTA>GAA" "GTC>GAC" "GTG>GAG" "GTT>GAT" "TTA>TAA" "TTC>TAC" "TTG>TAG" "TTT>TAT" "ATA>ACA" "ATC>ACC"
[67] "ATG>ACG" "ATT>ACT" "CTA>CCA" "CTC>CCC" "CTG>CCG" "CTT>CCT" "GTA>GCA" "GTC>GCC" "GTG>GCG" "GTT>GCT" "TTA>TCA"
[78] "TTC>TCC" "TTG>TCG" "TTT>TCT" "ATA>AGA" "ATC>AGC" "ATG>AGG" "ATT>AGT" "CTA>CGA" "CTC>CGC" "CTG>CGG" "CTT>CGT"
[89] "GTA>GGA" "GTC>GGC" "GTG>GGG" "GTT>GGT" "TTA>TGA" "TTC>TGC" "TTG>TGG" "TTT>TGT"
whereas the documentation has this:
# Human genome trinucleotide frequencies (from EMu)
freq <- c(1.14e+08, 6.60e+07, 1.43e+07, 9.12e+07, # C>A @ AC[ACGT]
1.05e+08, 7.46e+07, 1.57e+07, 1.01e+08, # C>A @ CC[ACGT]
8.17e+07, 6.76e+07, 1.35e+07, 7.93e+07, # C>A @ GC[ACGT]
1.11e+08, 8.75e+07, 1.25e+07, 1.25e+08, # C>A @ TC[ACGT]
1.14e+08, 6.60e+07, 1.43e+07, 9.12e+07, # C>G @ AC[ACGT]
1.05e+08, 7.46e+07, 1.57e+07, 1.01e+08, # C>G @ CC[ACGT]
8.17e+07, 6.76e+07, 1.35e+07, 7.93e+07, # C>G @ GC[ACGT]
1.11e+08, 8.75e+07, 1.25e+07, 1.25e+08, # C>G @ TC[ACGT]
1.14e+08, 6.60e+07, 1.43e+07, 9.12e+07, # C>T @ AC[ACGT]
1.05e+08, 7.46e+07, 1.57e+07, 1.01e+08, # C>T @ CC[ACGT]
8.17e+07, 6.76e+07, 1.35e+07, 7.93e+07, # C>T @ GC[ACGT]
1.11e+08, 8.75e+07, 1.25e+07, 1.25e+08, # C>T @ TC[ACGT]
1.17e+08, 7.57e+07, 1.04e+08, 1.41e+08, # T>A @ AC[ACGT]
7.31e+07, 9.55e+07, 1.15e+08, 1.13e+08, # T>A @ CC[ACGT]
6.43e+07, 5.36e+07, 8.52e+07, 8.27e+07, # T>A @ GC[ACGT]
1.18e+08, 1.12e+08, 1.07e+08, 2.18e+08, # T>A @ TC[ACGT]
1.17e+08, 7.57e+07, 1.04e+08, 1.41e+08, # T>C @ AC[ACGT]
7.31e+07, 9.55e+07, 1.15e+08, 1.13e+08, # T>C @ CC[ACGT]
6.43e+07, 5.36e+07, 8.52e+07, 8.27e+07, # T>C @ GC[ACGT]
1.18e+08, 1.12e+08, 1.07e+08, 2.18e+08, # T>C @ TC[ACGT]
1.17e+08, 7.57e+07, 1.04e+08, 1.41e+08, # T>G @ AC[ACGT]
7.31e+07, 9.55e+07, 1.15e+08, 1.13e+08, # T>G @ AC[ACGT]
6.43e+07, 5.36e+07, 8.52e+07, 8.27e+07, # T>G @ AG[ACGT]
1.18e+08, 1.12e+08, 1.07e+08, 2.18e+08) # T>G @ AT[ACGT]
kgori commented
Hi gevro,
Well spotted, the latter half of the commented labels is wrong. They should follow the pattern AT*, CT*, GT*, TT*. We will fix this in the next release.
Thanks for the report,
Kevin
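For anyone who wants to check the label order without typing it out by hand, the 96 labels listed in the report can be generated programmatically. This is only a minimal sketch in base R: the names `bases`, `make_labels` and `trinuc_labels` are made up for illustration and are not part of the package, and the ordering is assumed to be the one given in the report above (substitution type outermost, then 5' base, then 3' base varying fastest).

```r
# Illustrative helper, not part of the package: builds labels such as
# "ACA>AAA" in the order given in the report above.
bases <- c("A", "C", "G", "T")

make_labels <- function(ref) {
  # Possible substitutions for this reference base, in alphabetical order
  alts <- setdiff(bases, ref)
  unlist(lapply(alts, function(alt) {
    # expand.grid varies its first column fastest, so the 3' base changes
    # quickest and the 5' base slowest, i.e. A?A, A?C, A?G, A?T, C?A, ...
    ctx <- expand.grid(three = bases, five = bases, stringsAsFactors = FALSE)
    paste0(ctx$five, ref, ctx$three, ">", ctx$five, alt, ctx$three)
  }))
}

# C>A, C>G, C>T contexts first, then T>A, T>C, T>G
trinuc_labels <- c(make_labels("C"), make_labels("T"))

# Reference trinucleotides only (the part before ">")
trinuc_context <- sub(">.*", "", trinuc_labels)
```

If the frequency vector from the documentation is available as `freq`, something like `names(freq) <- trinuc_labels` would attach the labels, which makes a mismatch between values and comments easier to spot.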