Losing ~60% of variants from sequence data with SHAPEIT
klmartinez opened this issue · 2 comments
I have a sequenced locus on chromosome 9 that I am attempting to phase with SHAPEIT. My original VCF file has 1617 variants. After phasing with SHAPEIT I am left with only 513 variants. I checked the quality of my VCF with checkVCF and found no duplicates or reference mismatches.
I use the following code to run SHAPEIT:
shapeit4.2 --input my_data.vcf.gz --map chr9.b37.gmap.gz --region 9 --reference 1000g_phase3_nomulti_allelic/ALL.chr9.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz --thread 8 --log shapeit_chr9.log --output my_data_phased_SHAPEIT.vcf --mcmc-iterations 10b,1p,1b,1p,1b,1p,1b,1p,10m --pbwt-depth 8 &
Is there a way to minimize the loss of variants? Or am I doing something wrong that results in this huge loss of variants?
Hi,
I bet you lose variants because they are not in the reference panel (matching on position + alleles).
When using a reference panel, you can only phase sites in the overlap.
Best,
Olivier.
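One way to check this explanation is to count how many input sites also appear in the reference panel with the same position and alleles. The sketch below uses only standard shell tools; the function names `site_keys` and `count_shared_sites` are hypothetical helpers, not part of SHAPEIT. It assumes both files use the same chromosome naming (for example `9` rather than `chr9`) and compares the REF and ALT fields as exact strings, which may not be precisely how SHAPEIT matches sites.

```shell
# Print one CHROM:POS:REF:ALT key per variant line of a VCF (plain or gzipped).
site_keys() {
  gzip -cdf "$1" | awk -F'\t' '!/^#/ {print $1":"$2":"$4":"$5}' | LC_ALL=C sort -u
}

# Count sites present in both files with identical position and alleles.
# $1 = study VCF, $2 = reference panel VCF
count_shared_sites() {
  study_keys=$(mktemp)
  ref_keys=$(mktemp)
  site_keys "$1" > "$study_keys"
  site_keys "$2" > "$ref_keys"
  # comm -12 keeps only the lines common to both sorted key lists
  LC_ALL=C comm -12 "$study_keys" "$ref_keys" | awk 'END {print NR}'
  rm -f "$study_keys" "$ref_keys"
}
```

Calling `count_shared_sites my_data.vcf.gz` with the reference panel file as the second argument prints the number of shared sites. If that number is close to 513, the loss is explained by sites missing from the panel rather than by a problem in the input VCF.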
Sorry, I don't understand: why are variants removed during phasing? I would like to know.