inconsistent reproducibility of variants
anoronh4 opened this issue · 0 comments
anoronh4 commented
We are seeing reproducibility issues when running Strelka2 to call germline and somatic variants. One issue is the tier 1 and tier 2 base counts: the AU:CU:GU:TU fields are often missing (".") in one run but populated in another. Take for example the following variant:
Run 1:
19 8615160 . G T . LowEVS SOMATIC;QSS=1;TQSS=1;NT=ref;QSS_NT=1;TQSS_NT=1;SGT=GG->GG;DP=180;MQ=60;MQ0=0;ReadPosRankSum=-0.17;SNVSB=0;SomaticEVS=1.14;EVSF=1,1,0.024793,60,0,0,-0.16669,-1.2052,0,0,32,42,0,0 DP:FDP:SDP:SUBDP:AU:CU:GU:TU 56:0:0:0:0,0:0,0:56,57:0,0 121:0:0:0:0:.:.:.
Run 2:
19 8615160 . G T . LowEVS SOMATIC;QSS=1;TQSS=1;NT=ref;QSS_NT=1;TQSS_NT=1;SGT=GG->GG;DP=180;MQ=60;MQ0=0;ReadPosRankSum=-0.17;SNVSB=0;SomaticEVS=1.14;EVSF=1,1,0.024793,60,0,0,-0.16669,-1.2052,0,0,32,42,0,0 DP:FDP:SDP:SUBDP:AU:CU:GU:TU 56:0:0:0:0,0:0,0:56,57:0,0 121:0:0:0:0,0:0,0:118,120:3,3
We also found that the second run has 13 fewer total variants than the first. We have seen this kind of inconsistency in other samples as well. Is this expected, and is there any way to ensure reproducibility or to better catch these errors?
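For reference, this is roughly how the two runs can be compared. It is a minimal sketch, not part of Strelka2: the function names `load_vcf_records`, `compare_runs`, and `missing_format_fields` are our own, and it assumes uncompressed, tab-delimited VCF lines with one ALT allele per record. Records are keyed by (CHROM, POS, REF, ALT), and the check reports records present in only one run, columns that differ between runs, and FORMAT keys whose sample value is ".".

```python
def load_vcf_records(lines):
    """Map (CHROM, POS, REF, ALT) -> list of columns, skipping header lines."""
    records = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        records[(cols[0], cols[1], cols[3], cols[4])] = cols
    return records


def compare_runs(lines_a, lines_b):
    """Return (keys only in A, keys only in B, {shared key: differing column indexes})."""
    a = load_vcf_records(lines_a)
    b = load_vcf_records(lines_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = {}
    for key in sorted(set(a) & set(b)):
        width = max(len(a[key]), len(b[key]))
        # Pad the shorter record so a missing column counts as a difference.
        row_a = a[key] + [None] * (width - len(a[key]))
        row_b = b[key] + [None] * (width - len(b[key]))
        diffs = [i for i in range(width) if row_a[i] != row_b[i]]
        if diffs:
            changed[key] = diffs
    return only_a, only_b, changed


def missing_format_fields(cols):
    """List FORMAT keys whose value is '.' in at least one sample column."""
    keys = cols[8].split(":")
    missing = []
    for sample in cols[9:]:
        for key, value in zip(keys, sample.split(":")):
            if value == "." and key not in missing:
                missing.append(key)
    return missing
```

To use it, pass two open file handles (or lists of lines) to `compare_runs`. Column index 10 in the result corresponds to the second sample column, which is where the two records above differ.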