improve variant calling pipeline
k8hertweck opened this issue · 0 comments
k8hertweck commented
- add step to mark or remove PCR duplicates (after read mapping, before SNP calling), using Picard MarkDuplicates
- filter out sites with extremely high depth of coverage (load the VCF file into R, plot the distribution of depths, and remove any sites above 1.5-2x the median depth of coverage to get rid of the long tail)
- separate SNPs from indels after variant calling using SelectVariants in GATK
- accommodate the possibility of false positive SNP calls around indels by using the mask option of VariantFiltration in GATK, together with SelectVariants, to remove SNPs within 6 bases of an indel
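
A rough sketch of the depth and indel-proximity filters above, to make the intended logic concrete. It is written in Python rather than R, assumes depth is stored in the `DP` field of the INFO column, and treats any site whose ALT length differs from REF as an indel (symbolic alleles are not handled). The `Site` type and all function names are hypothetical, and this is not how GATK implements masking; it only illustrates the rule.

```python
import statistics
from collections import namedtuple

# Hypothetical record type for this sketch; not part of any VCF library.
Site = namedtuple("Site", "chrom pos ref alts depth")


def parse_vcf_line(line):
    """Parse one tab-separated VCF data line, reading depth from INFO DP=."""
    fields = line.rstrip("\n").split("\t")
    depth = None
    for entry in fields[7].split(";"):
        if entry.startswith("DP="):
            depth = int(entry[3:])
    return Site(fields[0], int(fields[1]), fields[3], fields[4].split(","), depth)


def is_indel(site):
    """Treat a site as an indel if any ALT allele differs in length from REF."""
    return any(len(alt) != len(site.ref) for alt in site.alts)


def filter_high_depth(sites, factor=1.5):
    """Drop sites whose depth exceeds factor * median depth.

    Sites without a DP value are kept. Returns (kept_sites, cutoff).
    """
    depths = [s.depth for s in sites if s.depth is not None]
    cutoff = factor * statistics.median(depths)
    kept = [s for s in sites if s.depth is None or s.depth <= cutoff]
    return kept, cutoff


def snps_away_from_indels(sites, window=6):
    """Return SNP sites that are more than `window` bases from every indel."""
    # Each indel is assumed to cover the span of its REF allele.
    spans = [(s.chrom, s.pos, s.pos + len(s.ref) - 1) for s in sites if is_indel(s)]
    snps = []
    for s in sites:
        if is_indel(s):
            continue  # indels are handled separately from SNPs
        near = any(
            chrom == s.chrom and start - window <= s.pos <= end + window
            for chrom, start, end in spans
        )
        if not near:
            snps.append(s)
    return snps
```

The `factor` argument corresponds to the 1.5-2x multiplier and `window` to the 6-base distance; both would need tuning against the actual depth distribution.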