Illumina/strelka

inconsistent reproducibility of variants

anoronh4 opened this issue · 0 comments

We are seeing reproducibility issues when running Strelka2 to call germline and somatic variants. One issue is the tier 1/tier 2 base counts: the AU:CU:GU:TU fields are often missing (`.`) or incomplete in one run but fully populated in another run on the same input. Take for example the following variant, where the second sample column differs between runs:
Run 1:

19	8615160	.	G	T	.	LowEVS	SOMATIC;QSS=1;TQSS=1;NT=ref;QSS_NT=1;TQSS_NT=1;SGT=GG->GG;DP=180;MQ=60;MQ0=0;ReadPosRankSum=-0.17;SNVSB=0;SomaticEVS=1.14;EVSF=1,1,0.024793,60,0,0,-0.16669,-1.2052,0,0,32,42,0,0	DP:FDP:SDP:SUBDP:AU:CU:GU:TU	56:0:0:0:0,0:0,0:56,57:0,0	121:0:0:0:0:.:.:.

Run 2:

19	8615160	.	G	T	.	LowEVS	SOMATIC;QSS=1;TQSS=1;NT=ref;QSS_NT=1;TQSS_NT=1;SGT=GG->GG;DP=180;MQ=60;MQ0=0;ReadPosRankSum=-0.17;SNVSB=0;SomaticEVS=1.14;EVSF=1,1,0.024793,60,0,0,-0.16669,-1.2052,0,0,32,42,0,0	DP:FDP:SDP:SUBDP:AU:CU:GU:TU	56:0:0:0:0,0:0,0:56,57:0,0	121:0:0:0:0,0:0,0:118,120:3,3

We also found that the total variant counts of the two runs differ by 13. We have noticed this kind of inconsistency in other samples. Is this expected, and is there any way to ensure reproducibility or to catch these errors more reliably?
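
For anyone who wants to check their own runs for the same problem, here is a minimal sketch of how the two outputs could be compared. It is not part of Strelka2; the function names (`read_vcf`, `missing_fields`, `compare_runs`) and the file names in the usage note are hypothetical. It assumes a standard tab-separated VCF with FORMAT in column 9 and sample values from column 10 onward, and it keys records on CHROM, POS, REF and ALT.

```python
import gzip


def read_vcf(path):
    """Map (CHROM, POS, REF, ALT) -> list of tab-separated columns."""
    opener = gzip.open if path.endswith(".gz") else open
    records = {}
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            records[(cols[0], cols[1], cols[3], cols[4])] = cols
    return records


def missing_fields(cols):
    """Return (sample_index, FORMAT key) pairs whose value is '.' or absent."""
    keys = cols[8].split(":")
    missing = []
    for i, sample in enumerate(cols[9:]):
        values = sample.split(":")
        for j, key in enumerate(keys):
            if j >= len(values) or values[j] == ".":
                missing.append((i, key))
    return missing


def compare_runs(run1, run2):
    """Return keys only in run1, keys only in run2, and shared keys whose columns differ."""
    only1 = sorted(set(run1) - set(run2))
    only2 = sorted(set(run2) - set(run1))
    changed = sorted(k for k in set(run1) & set(run2) if run1[k] != run2[k])
    return only1, only2, changed
```

Usage would look something like `only1, only2, changed = compare_runs(read_vcf("run1.vcf.gz"), read_vcf("run2.vcf.gz"))`, followed by `missing_fields(...)` on each changed record. Applied to the Run 1 record above, `missing_fields` flags CU, GU and TU in the second sample column; the single-valued AU entry (`0` instead of `0,0`) is not flagged by this check and would need a separate test on the number of comma-separated values.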