Malformed VCF output in snp.raw.vcf files
wac opened this issue · 1 comments
In both versions 1.0.0 and 1.0.1, I have been getting VCF lines that have 19 columns rather than the expected 10, as well as blank lines. It appears that some lines are concatenated together (the chromosome of the subsequent line is appended directly to the last field of the preceding line, with no newline in between). I was running BisSNP in multithreaded mode with quite a few threads (usually 10 or 23), so it does look like something to do with the I/O. A couple of examples follow:
This one has two malformed lines in a row that are also duplicates of one another, followed by a blank line:
6 168841154 . A G 74.88 PASS DP=16;MQ0=0;NS=1;SB=-0.0007 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:35.9,NaN:0,0,2,0,0,14:.:.:.:.:2,14,0,0:75,0,76:75:56 167467433 rs1358883 G C 54.53 PASS CS=-;DB;DP=15;MQ0=0;NS=1;REF=C,CH;SB=-0.1779GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:36.6,37.0:0,0,9,3,0,2:.:.:.:.:9,0,0,3:55,0,408:55:5
6 168841154 . A G 74.88 PASS DP=16;MQ0=0;NS=1;SB=-0.0007 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:35.9,NaN:0,0,2,0,0,14:.:.:.:.:2,14,0,0:75,0,76:75:56 167467433 rs1358883 G C 54.53 PASS CS=-;DB;DP=15;MQ0=0;NS=1;REF=C,CH;SB=-0.1779GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:36.6,37.0:0,0,9,3,0,2:.:.:.:.:9,0,0,3:55,0,408:55:5
And another example: a malformed line, followed by a correctly formatted line that duplicates the first part of the malformed line, followed by a blank line:
11 119559423 . G A 56.45 PASS CS=-;DP=11;MQ0=0;NS=1;REF=C,CH;SB=-0.0000 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:35.5,35.7:0,0,11,0,0,0:.:.:.:.:8,0,3,0:56,0,292:56:511 118901796 . T C 33.69 PASS DP=13;MQ0=0;NS=1;SB=-32.7049 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:37.0,37.0:0,7,0,2,4,0:.:.:.:.:7,4,0,2:34,0,152:34:5
11 119559423 . G A 56.45 PASS CS=-;DP=11;MQ0=0;NS=1;REF=C,CH;SB=-0.0000 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/1:35.5,35.7:0,0,11,0,0,0:.:.:.:.:8,0,3,0:56,0,292:56:5
Also, this issue does not appear to occur in the older version (0.82.2), so perhaps it is something in the interface with the new GATK?
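For anyone who wants to check their own output for the same symptoms, here is a minimal Python sketch that flags blank lines and records with the wrong number of columns. It is not part of BisSNP. The function names (`classify_vcf_line`, `scan_vcf`) are made up for illustration, and it assumes a tab-separated, single-sample VCF, so 10 columns per record.

```python
import sys
from collections import Counter

# Assumption: single-sample VCF, i.e. 8 fixed columns + FORMAT + 1 sample column.
EXPECTED_COLUMNS = 10


def classify_vcf_line(line, expected=EXPECTED_COLUMNS):
    """Return 'header', 'blank', 'ok', or 'malformed' for one VCF line."""
    stripped = line.rstrip("\r\n")
    if stripped.startswith("#"):
        return "header"
    if not stripped.strip():
        return "blank"
    # VCF records are tab-separated; count the fields.
    n_fields = len(stripped.split("\t"))
    return "ok" if n_fields == expected else "malformed"


def scan_vcf(lines, expected=EXPECTED_COLUMNS):
    """Count line classes and collect (line_number, field_count) for bad records."""
    counts = Counter()
    bad = []
    for number, line in enumerate(lines, start=1):
        kind = classify_vcf_line(line, expected)
        counts[kind] += 1
        if kind == "malformed":
            bad.append((number, len(line.rstrip("\r\n").split("\t"))))
    return counts, bad


# Only runs when a file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        counts, bad = scan_vcf(handle)
    print(dict(counts))
    for number, n_fields in bad:
        print(f"line {number}: {n_fields} fields")
```

Two records fused without a newline should show up with 19 fields, since the last field of the first record and the first field of the second become one. This only detects the problem; it makes no attempt to repair the merged lines.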
I have the same problem.
The line with the wrong number of columns looks like two lines merged together, and the next line is empty.
chr1 1186408 . C . 34.76 PASS CS=+;Context=WCG;DP=1;MQ0=0;NS=1;REF=0 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:38.0,NaN:1,0,0,0,0,0:1:WCG:0:.:1,0,0,0:0,35,78:35:5
chr1 1320899 . G . 37.77 PASS CS=-;Context=GCH;DP=2;MQ0=0;NS=1;REF=0 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:36.0,NaN:0,0,0,2,0,0:0:GCH:0:.:2,0,0,0:0,38,116:38:5chr1 1573517 31.75 PASS CS=-;Context=HCH;DP=1;MQ0=0;NS=1;REF=0 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:39.0:0,1,0,0,0,0:0:HCH:1:.:0,0,0,1:0,32,35:32:5
chr1 1320899 . G . 37.77 PASS CS=-;Context=GCH;DP=2;MQ0=0;NS=1;REF=0 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:36.0,NaN:0,0,0,2,0,0:0:GCH:0:.:2,0,0,0:0,38,116:38:5chr1 1573517 31.75 PASS CS=-;Context=HCH;DP=1;MQ0=0;NS=1;REF=0 GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:39.0:0,1,0,0,0,0:0:HCH:1:.:0,0,0,1:0,32,35:32:5