odelaneau/shapeit4

Losing ~60% of variants from sequence data with SHAPEIT

klmartinez opened this issue · 2 comments

I have a sequenced locus on chromosome 9 that I am attempting to phase with SHAPEIT. My original VCF file has 1617 variants. After phasing with SHAPEIT, I am left with only 513 variants. I checked the quality of my VCF with checkVCF and found no duplicates or reference mismatches.

I use the following code to run SHAPEIT:

shapeit4.2 --input my_data.vcf.gz --map chr9.b37.gmap.gz --region 9 --reference 1000g_phase3_nomulti_allelic/ALL.chr9.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz --thread 8 --log shapeit_chr9.log --output my_data_phased_SHAPEIT.vcf --mcmc-iterations 10b,1p,1b,1p,1b,1p,1b,1p,10m --pbwt-depth 8 &

Is there a way to minimize the loss of variants? Or is there something I am doing wrong that may be causing the huge loss of variants?

Hi,

I bet you lose variants because they are not in the reference panel (a site has to match on position + alleles).

When using a reference panel, you can only phase sites in the overlap between your data and the panel.

Best,

Olivier.
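
To check how many variants fall in that overlap before phasing, something like the following rough sketch may help. It is not part of SHAPEIT: the function names `site_keys` and `overlap_report` are made up for illustration, and it assumes that a site "matches" when chromosome, position, REF and ALT are identical strings in both files, which may be stricter or looser than what SHAPEIT actually does. It also loads every site into memory, so the reference panel should probably be subset to the locus first.

```python
import gzip


def site_keys(vcf_path):
    """Collect (chrom, pos, ref, alt) for every record in a VCF file.

    Assumption: plain-text or gzip/bgzip-compressed VCF, tab-separated.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    keys = set()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t", 5)
            if len(fields) < 5:
                continue  # skip malformed or empty lines
            chrom, pos, _id, ref, alt = fields[:5]
            keys.add((chrom, int(pos), ref, alt))
    return keys


def overlap_report(target_path, reference_path):
    """Count target sites that are also present in the reference panel."""
    target = site_keys(target_path)
    reference = site_keys(reference_path)
    shared = target & reference
    return {
        "target": len(target),
        "shared": len(shared),
        "target_only": len(target) - len(shared),
    }
```

Calling `overlap_report("my_data.vcf.gz", reference_path)` with the reference panel file would then give the number of input sites, the number shared with the panel, and the number found only in the input. If `shared` comes out close to the number of variants left after phasing, the loss is explained by the panel overlap rather than by a problem with the command.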

Sorry, I don't understand: why are variants removed when they are being phased?
I would like to know.