Exclude LowQual positions from the SNP count
Is your feature request related to a problem? Please describe
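The total number of SNP differences reported in the output MultiQC plot includes low-quality SNPs. In the example below, the plot reports 15 SNPs for the sample, but only 2 of the 15 positions in the filtered VCF have FILTER set to PASS.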
Sample Name | SNP |
---|---|
Leg180-11142022-KCH-D5512A | 15 |
#CHROM | POS | QUAL | FILTER |
---|---|---|---|
DACUAB010000005.1 | 133396 | 228 | PASS |
DACUAB010000005.1 | 133545 | 228 | LowQual |
DACUAB010000005.1 | 133669 | 228 | PASS |
DACUAB010000011.1 | 24 | 8.99921 | LowQual |
DACUAB010000011.1 | 27 | 8.99921 | LowQual |
DACUAB010000011.1 | 51 | 7.30814 | LowQual |
DACUAB010000011.1 | 54 | 7.30814 | LowQual |
DACUAB010000018.1 | 45824 | 30.4183 | LowQual |
DACUAB010000025.1 | 50907 | 6.51248 | LowQual |
DACUAB010000030.1 | 203 | 111 | LowQual |
DACUAB010000068.1 | 39 | 164 | LowQual |
DACUAB010000068.1 | 44 | 170 | LowQual |
DACUAB010000068.1 | 45 | 166 | LowQual |
DACUAB010000068.1 | 46 | 163 | LowQual |
DACUAB010000068.1 | 50 | 161 | LowQual |
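To make the mismatch explicit, here is a minimal sketch that tallies the FILTER values of the 15 records listed above. The records are copied from the table; the variable names are only for illustration and are not part of the pipeline.

```python
from collections import Counter

# (CHROM, POS, FILTER) copied from the table above.
records = [
    ("DACUAB010000005.1", 133396, "PASS"),
    ("DACUAB010000005.1", 133545, "LowQual"),
    ("DACUAB010000005.1", 133669, "PASS"),
    ("DACUAB010000011.1", 24, "LowQual"),
    ("DACUAB010000011.1", 27, "LowQual"),
    ("DACUAB010000011.1", 51, "LowQual"),
    ("DACUAB010000011.1", 54, "LowQual"),
    ("DACUAB010000018.1", 45824, "LowQual"),
    ("DACUAB010000025.1", 50907, "LowQual"),
    ("DACUAB010000030.1", 203, "LowQual"),
    ("DACUAB010000068.1", 39, "LowQual"),
    ("DACUAB010000068.1", 44, "LowQual"),
    ("DACUAB010000068.1", 45, "LowQual"),
    ("DACUAB010000068.1", 46, "LowQual"),
    ("DACUAB010000068.1", 50, "LowQual"),
]

# Count how many records carry each FILTER value.
filter_counts = Counter(filt for _, _, filt in records)
total = sum(filter_counts.values())

print(f"total={total} PASS={filter_counts['PASS']} LowQual={filter_counts['LowQual']}")
# total=15 PASS=2 LowQual=13
```

Note that several LowQual records have a high QUAL value (for example 228 at position 133545), so they were presumably flagged by a filter criterion other than QUAL; the FILTER column is what matters here.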
Describe the solution you'd like
Could the MultiQC plot include a second column with the number of SNPs that pass the quality filters?
One option would be to use the --apply-filters (-f) flag of bcftools stats (see below), but I have not confirmed what other information from the bcftools stats output is used in the MultiQC plot.
bcftools stats -f PASS test.filtered.vcf.gz
So instead of the 15 SNPs currently reported in the MultiQC plot, using the filter option would make the reported count reflect only the SNPs that pass the applied bcftools filters (see below).
Sample Name | SNP |
---|---|
Leg180-11142022-KCH-D5512A | 2 |
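As an independent cross-check of the two numbers, the following rough sketch counts SNP records directly from VCF text, once for all records and once for PASS only. It is not how bcftools or MultiQC compute their statistics: `count_snps` is a hypothetical helper, the SNP definition is simplified (single-base REF and single-base A/C/G/T ALT alleles), and the three-record VCF is made-up demo data.

```python
def count_snps(vcf_lines):
    """Return (all_snps, pass_snps) for an iterable of VCF text lines.

    Simplified SNP definition (assumption): REF is one base and every
    ALT allele is a single A, C, G, or T. bcftools stats may count
    multi-allelic or mixed sites differently.
    """
    all_snps = 0
    pass_snps = 0
    for line in vcf_lines:
        # Skip meta-information and header lines, and blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]
        alts = fields[4].split(",")
        filt = fields[6]
        if len(ref) != 1 or not all(alt in {"A", "C", "G", "T"} for alt in alts):
            continue  # not a SNP under the simplified definition
        all_snps += 1
        if filt == "PASS":
            pass_snps += 1
    return all_snps, pass_snps


# Made-up demo data: one PASS SNP, one LowQual SNP, one PASS insertion.
example_vcf = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
contig1\t10\t.\tA\tG\t228\tPASS\t.
contig1\t20\t.\tC\tT\t9\tLowQual\t.
contig1\t30\t.\tG\tGA\t200\tPASS\t.
"""

all_snps, pass_snps = count_snps(example_vcf.splitlines())
print(f"SNP={all_snps} SNP (PASS)={pass_snps}")
# SNP=2 SNP (PASS)=1
```

For a real file, the lines could come from `gzip.open("test.filtered.vcf.gz", "rt")`, since Python's `gzip` module can read bgzip-compressed VCFs.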
But this may not be an issue for others, so it is unclear how helpful this feature would be for the larger community.
Describe alternatives you've considered
At present, I only discovered this by visually examining some output.
Additional context
N/A