Illumina/strelka

unusual allele calls following indel

mcfiston opened this issue · 1 comments

Hi!

I have been running the germline variant caller on some BAM files and I am getting some strange allele calls: a haploid genotype call seems to follow every indel. See positions 5140 and 5141 below for an example. Is this normal behaviour? I suppose it reflects that position 5141 does not exist on the haplotype carrying the deletion, but I don't understand why two alleles are listed (REF G and ALT A at position 5141 in this example) when the genotype is a single 0, and this formatting (a genotype with only a single allele) is giving me errors in downstream analyses.

13	5139	.	T	.	.	PASS	END=5141;BLOCKAVG_min30p3a;AN=0	GT:GQX:DP:DPF:MIN_DP	./.:.:.:.:.
13	5140	.	TG	T	95	PASS	MQ=60;END=5151;BLOCKAVG_min30p3a;CIGAR=1M1D;RU=G;REFREP=2;IDREP=1;AC=1;AN=2	GT:GQ:GQX:DPI:AD:ADF:ADR:FT:PL:DP:DPF:MIN_DP	0/1:103:27:17:8,8:6,4:2,4:PASS:137,0,100:.:.:.
13	5141	.	G	A	281	PASS	SNVHPOL=3;MQ=60;AC=0;AN=1	GT:GQ:GQX:DP:DPF:AD:ADF:ADR:SB:FT:PL:PS:MIN_DP	0:.:94:7:2:.:.:.:.:.:.:.:7
13	5142	.	G	.	.	PASS	END=5227;BLOCKAVG_min30p3a;AN=2	GT:GQX:DP:DPF:MIN_DP	0/0:33:21:2:12

If this behaviour is expected, does Strelka have some simple way of filtering these sites out, or should I just do that separately? For reference, the commands I used to generate my variant calls are pasted below.

Many thanks!!

configureStrelkaGermlineWorkflow.py --callRegions chr-13.bed.gz --runDir ${RUNDIR} --reference ${REF} --bam ${BAM}Bhu118.1.RG.MarkDup.bam 
${RUNDIR}runWorkflow.py -m local -j 16
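
In case filtering separately turns out to be the way to go, here is a minimal sketch of what that step might look like. It is not a Strelka feature, just plain-text post-processing. It assumes a single-sample VCF with tab-separated columns that has already been decompressed, and the function names (`genotype`, `is_haploid`, `drop_haploid_calls`) are made up for illustration.

```python
def genotype(record_line):
    """Return the GT string of the first sample in a VCF data line, or None."""
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        # No FORMAT/sample columns on this line.
        return None
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    if "GT" not in format_keys:
        return None
    idx = format_keys.index("GT")
    return sample_values[idx] if idx < len(sample_values) else None


def is_haploid(gt):
    """True if GT names a single allele, i.e. has no '/' or '|' separator.

    Note that a bare '.' (missing haploid genotype) also counts as haploid here.
    """
    return gt is not None and "/" not in gt and "|" not in gt


def drop_haploid_calls(lines):
    """Yield header lines and every record whose first-sample GT is not haploid."""
    for line in lines:
        if line.startswith("#") or not is_haploid(genotype(line)):
            yield line
```

With `import sys`, something like `sys.stdout.writelines(drop_haploid_calls(sys.stdin))` would pass a VCF through on stdin/stdout. On the four records above, only the line at position 5141 (GT `0`) would be dropped; the `./.`, `0/1` and `0/0` records are kept. Whether dropping these records is the right thing to do depends on the downstream tool, so treat this as a workaround rather than a recommendation.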

How do you get the "GT" field? I can't get it.