Questions about the results text output.
Hi Jody,
Output from NTM Profiler version 0.2.1 is below:
A few questions:
1) I noticed that the gene name is not listed in the "Resistance variants report" and "Other variants report" sections. Could the gene name be added to these sections?
2) What is meant to be reported in the "Resistance genes report" section?
3) Could you explain what the 149.500 means in the "Mean Kmer Coverage" section?
4) Pipeline version is showing as 0.2.0. Should this be 0.2.1?
Thanks very much for making this fantastic tool available to the community!
Michael
NTM-Profiler report
The following report has been generated by NTM-Profiler.
Summary
ID: SRR315352
Date: Wed Aug 17 16:10:18 2022
Species report
Species Mean Kmer Coverage
Mycobacterium abscessus subsp. massiliense 149.500
Resistance report
Drug Genotypic Resistance Mutations
Macrolides R rrl n.2270A>G (1.00)
Amikacin R rrs n.1375A>G (1.00)
Resistance genes report
Locus Tag Gene Drug
Resistance variants report
Genome Position Locus Tag Variant Type Change Estimated Fraction Drug
1463772 MAB_r5051 non_coding_transcript_exon_variant n.1375A>G 1.000 amikacin
1466477 MAB_r5052 non_coding_transcript_exon_variant n.2270A>G 1.000 macrolides
Other variants report
Genome Position Locus Tag Variant Type Change Estimated Fraction
1462247 MAB_r5051 upstream_gene_variant n.-151A>G 1.000
1462267 MAB_r5051 upstream_gene_variant n.-131_-130insA 1.000
1462275 MAB_r5051 upstream_gene_variant n.-123T>C 1.000
1463374 MAB_r5051 non_coding_transcript_exon_variant n.977C>T 1.000
1463968 MAB_r5052 upstream_gene_variant n.-240A>G 1.000
1464005 MAB_r5052 upstream_gene_variant n.-203_-202insC 1.000
1464183 MAB_r5052 upstream_gene_variant n.-25G>A 1.000
1464841 MAB_r5052 non_coding_transcript_exon_variant n.634C>T 1.000
1465924 MAB_r5052 non_coding_transcript_exon_variant n.1717A>G 0.964
1467208 MAB_r5052 non_coding_transcript_exon_variant n.3001T>C 1.000
2345756 MAB_2297 upstream_gene_variant c.-199T>C 1.000
2345783 MAB_2297 upstream_gene_variant c.-172C>T 0.992
2345889 MAB_2297 upstream_gene_variant c.-66C>T 1.000
2345891 MAB_2297 upstream_gene_variant c.-64A>G 1.000
2345896 MAB_2297 upstream_gene_variant c.-59A>G 1.000
2345927 MAB_2297 upstream_gene_variant c.-28A>G 1.000
2345951 MAB_2297 upstream_gene_variant c.-4C>T 0.989
2345995 MAB_2297 missense_variant p.Pro14Gln 1.000
2346000 MAB_2297 missense_variant p.Thr16Ala 1.000
2346014 MAB_2297 frameshift_variant c.64_65delCG 0.855
2346039 MAB_2297 missense_variant p.Val29Phe 1.000
2346044 MAB_2297 synonymous_variant c.90C>T 1.000
2346063 MAB_2297 missense_variant p.Asp37Asn 1.000
2346077 MAB_2297 synonymous_variant c.123A>G 1.000
2346392 MAB_2297 synonymous_variant c.438A>C 0.970
2346420 MAB_2297 missense_variant p.Ala156Thr 0.990
Coverage report
Gene Locus_Tag Cutoff Fraction
rrs MAB_r5051 0 0.000
rrl MAB_r5052 0 0.000
erm(41) MAB_2297 0 0.287
Missing positions report
N/A
Analysis pipeline specifications
Pipeline version: 0.2.0
Species Database version: N/A
Resistance Database version: Mycobacterium_abscessus_subsp._massiliense_93c979b_Jody Phelan jody.phelan@lshtm.ac.uk_Fri Apr 29 17:33:42 2022 +0100
Analysis Program
Kmer counting kmc
Mapping bwa
Variant calling freebayes
Hi @harrismia
Thanks for using the tool!
- I noticed that the gene name is not listed in the "Resistance variants report" and "Other variants report" sections. Could the gene name be added to these sections?
Yes, that is definitely possible. I'll add that in the next release (which should be within the next week or two).
- What is meant to be reported in the "Resistance genes report" section?
There are some cases where the presence of a gene causes resistance to a particular drug. For example, the erm(41) gene in MAB subspecies confers inducible resistance to macrolides [1,2]. In this case, the presence of an intact gene leads to resistance. As many of the columns of the resistance variants section, such as frequency, change and type, don't really fit with a gene, we decided to make a new section specifically for resistance genes.
Resistance genes report
-----------------
Locus Tag Gene Drug
MAB_2297 erm(41) macrolides
- Could you explain what the 149.500 means in the "Mean Kmer Coverage" section?
Species are predicted by detecting specific kmers that have been found to be exclusive to certain species/subspecies. In case some isolates carry mutations or deletions in those kmers, we have selected 20 kmers per species. The "mean kmer coverage" is the mean count found across the 20 kmers. The value depends on the sequencing depth, but in general I guess higher values will give more confidence in the prediction. It can also potentially give a rough idea of the proportions in mixed infections, although I haven't validated it for this use!
Species report
-----------------
Species Mean Kmer Coverage
Mycobacterium abscessus subsp. massiliense 36.050
Mycobacterium abscessus subsp. abscessus 21.900
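To make the arithmetic concrete, here is a minimal Python sketch of how a "mean kmer coverage" value and a rough mixed-infection proportion could be worked out. It is not the NTM-Profiler code: the function names, the `kmer_NN` labels and the 20 example counts are all made up for illustration, and normalising the two coverages into proportions is only an assumption about how the "rough idea" could be computed (it is unvalidated, as noted above). Only the 36.050 and 21.900 values come from the report.

```python
from statistics import mean


def mean_kmer_coverage(kmer_counts):
    """Mean count across the marker k-mers of one species."""
    return mean(kmer_counts.values())


def rough_proportions(coverage_by_species):
    """Normalise mean k-mer coverages so they sum to 1 (unvalidated heuristic)."""
    total = sum(coverage_by_species.values())
    return {species: cov / total for species, cov in coverage_by_species.items()}


# Hypothetical counts for 20 marker k-mers (invented for illustration):
# ten k-mers seen 149 times and ten seen 150 times.
example_counts = {f"kmer_{i:02d}": 149 + (i % 2) for i in range(20)}
example_mean = mean_kmer_coverage(example_counts)

# Mean k-mer coverages taken from the two-species report above.
mixed = {
    "Mycobacterium abscessus subsp. massiliense": 36.050,
    "Mycobacterium abscessus subsp. abscessus": 21.900,
}
proportions = rough_proportions(mixed)

print(f"{example_mean:.3f}")
for species, fraction in proportions.items():
    print(f"{species}: {fraction:.3f}")
```

With these inputs the hypothetical counts average to 149.500, and the two subspecies come out at roughly 62% and 38% of the combined marker coverage. Treat those proportions as a rough indication at best.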
- Pipeline version is showing as 0.2.0. Should this be 0.2.1?
You're right, this should be showing v0.2.1. I'll update for the next release!
Thanks again for the feedback, and let me know if there are any more questions!
Thanks very much for the information and for developing this valuable tool!
No problem, glad it's useful!