nf-core/bactmap

Calculation of SNPs to exclude LowQual positions


Is your feature request related to a problem? Please describe

The total number of SNP differences reported in the MultiQC output plot includes low-quality SNPs, i.e. records whose FILTER value is LowQual rather than PASS.

Reported in the MultiQC plot:

Sample Name SNP
Leg180-11142022-KCH-D5512A 15

Records in the filtered VCF:

#CHROM POS QUAL FILTER
DACUAB010000005.1 133396 228 PASS
DACUAB010000005.1 133545 228 LowQual
DACUAB010000005.1 133669 228 PASS
DACUAB010000011.1 24 8.99921 LowQual
DACUAB010000011.1 27 8.99921 LowQual
DACUAB010000011.1 51 7.30814 LowQual
DACUAB010000011.1 54 7.30814 LowQual
DACUAB010000018.1 45824 30.4183 LowQual
DACUAB010000025.1 50907 6.51248 LowQual
DACUAB010000030.1 203 111 LowQual
DACUAB010000068.1 39 164 LowQual
DACUAB010000068.1 44 170 LowQual
DACUAB010000068.1 45 166 LowQual
DACUAB010000068.1 46 163 LowQual
DACUAB010000068.1 50 161 LowQual

Describe the solution you'd like

Could the MultiQC plot include a second column with the number of SNPs that pass the quality filters?
One option would be to use the --apply-filters (-f) flag of bcftools stats, as in the command below, but I have not confirmed what other information from the bcftools stats output is used in the MultiQC plot.
bcftools stats -f PASS test.filtered.vcf.gz

With the filter applied, the MultiQC plot would report only the SNPs that pass the bcftools filters, rather than the 15 SNPs currently reported (see below).

Sample Name SNP
Leg180-11142022-KCH-D5512A 2
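
To show where the two numbers come from, here is a minimal Python sketch that tallies the FILTER column of the records listed earlier in this issue. It is only an illustration of the counting, not how bactmap, bcftools or MultiQC compute the value; the function name `count_by_filter` and the inline `RECORDS` string are assumptions made for this example.

```python
from collections import Counter

# Records copied from the VCF excerpt in this issue: CHROM, POS, QUAL, FILTER.
RECORDS = """\
#CHROM POS QUAL FILTER
DACUAB010000005.1 133396 228 PASS
DACUAB010000005.1 133545 228 LowQual
DACUAB010000005.1 133669 228 PASS
DACUAB010000011.1 24 8.99921 LowQual
DACUAB010000011.1 27 8.99921 LowQual
DACUAB010000011.1 51 7.30814 LowQual
DACUAB010000011.1 54 7.30814 LowQual
DACUAB010000018.1 45824 30.4183 LowQual
DACUAB010000025.1 50907 6.51248 LowQual
DACUAB010000030.1 203 111 LowQual
DACUAB010000068.1 39 164 LowQual
DACUAB010000068.1 44 170 LowQual
DACUAB010000068.1 45 166 LowQual
DACUAB010000068.1 46 163 LowQual
DACUAB010000068.1 50 161 LowQual
"""


def count_by_filter(text):
    """Tally records by FILTER value (hypothetical helper for this example)."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the header line.
        if not line or line.startswith("#"):
            continue
        # FILTER is the last whitespace-separated field in this excerpt.
        counts[line.split()[-1]] += 1
    return counts


counts = count_by_filter(RECORDS)
total = sum(counts.values())
print(f"all SNP records: {total}")        # 15, the number currently shown
print(f"PASS only: {counts['PASS']}")     # 2, the number after filtering
```

The 13 remaining records are all LowQual, which accounts for the difference between the two counts.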

This may not be an issue for others, so it is unclear how helpful this feature would be for the larger community.

Describe alternatives you've considered

None so far; I only discovered this by visually examining some of the output.

Additional context

N/A