Latest versions of `vg deconstruct` have different wrong behaviors with graphs following the `PanSN-spec`
AndreaGuarracino commented
With this graph
scerevisiae7.community.0.fa.gz.d1a145e.417fcdf.7493449.smooth.final.gfa.gz
where we follow the PanSN-spec, I obtain:
# vg 1.40.0
vg deconstruct -P S288C -H '#' -e -a -t 16 scerevisiae7.community.0.fa.gz.d1a145e.417fcdf.7493449.smooth.final.gfa | grep chrI | head -n 5
##contig=<ID=S288C#1#chrI,length=219929>
S288C#1#chrI 5 >8>10 CCCCA C 60 . AC=1;AF=1;AN=1;AT=>8>9>10,>8>10;NS=1;LV=0 GT . . . . 1 .
S288C#1#chrI 35 >10>13 A C 60 . AC=1;AF=1;AN=1;AT=>10>12>13,>10>11>13;NS=1;LV=0 GT . . . . 1 .
S288C#1#chrI 37 >13>16 C A 60 . AC=1;AF=1;AN=1;AT=>13>14>16,>13>15>16;NS=1;LV=0 GT . . . . 1 .
S288C#1#chrI 41 >16>19 A C 60 . AC=1;AF=1;AN=1;AT=>16>17>19,>16>18>19;NS=1;LV=0 GT . . . . 1 .
that is correct.
# vg 1.43.0
vg deconstruct -P S288C -H '#' -e -a -t 16 scerevisiae7.community.0.fa.gz.d1a145e.417fcdf.7493449.smooth.final.gfa | grep chrI | head -n 5
##contig=<ID=S288C#1#chrI#0,length=219929>
S288C#1#chrI#0 5 >8>10 CCCCA C 60 . AC=1;AF=1;AN=1;AT=>8>9>10,>8>10;NS=1;LV=0 GT . . . . 1 .
S288C#1#chrI#0 35 >10>13 A C 60 . AC=1;AF=1;AN=1;AT=>10>12>13,>10>11>13;NS=1;LV=0 GT . . . . 1 .
S288C#1#chrI#0 37 >13>16 C A 60 . AC=1;AF=1;AN=1;AT=>13>14>16,>13>15>16;NS=1;LV=0 GT . . . . 1 .
S288C#1#chrI#0 41 >16>19 A C 60 . AC=1;AF=1;AN=1;AT=>16>17>19,>16>18>19;NS=1;LV=0 GT . . . . 1 .
that is strange because it adds `#0` at the end of the reference path name.
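To make the difference concrete, here is a minimal Python sketch that splits a path name on the `#` delimiter. It assumes the PanSN-spec layout `sample#haplotype#contig` (three fields); the helper name `split_pansn` is hypothetical and is not part of vg.

```python
def split_pansn(name, delim="#"):
    """Split a path name into its delimiter-separated fields.

    Assumption: a PanSN-spec name has exactly three fields
    (sample, haplotype, contig). This helper is illustrative only.
    """
    return name.split(delim)


for name in ("S288C#1#chrI", "S288C#1#chrI#0"):
    fields = split_pansn(name)
    if len(fields) == 3:
        status = "three fields, PanSN-style"
    else:
        status = "unexpected extra field(s): " + repr(fields[3:])
    print(name, "->", status)
```

The name reported by 1.40.0 splits into three fields, while the name reported by 1.43.0 carries a fourth field, `0`, which downstream tools that expect three fields may not handle.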
# vg 1.44.0
vg deconstruct -P S288C -H '#' -e -a -t 16 scerevisiae7.community.0.fa.gz.d1a145e.417fcdf.7493449.smooth.final.gfa | grep chrI | head -n 5
##contig=<ID=S288C#1#chrI#0,length=219929>
chrI 5 >8>10 CCCCA C 60 . AC=1;AF=1;AN=1;AT=>8>9>10,>8>10;NS=1;LV=0 GT . . . . 1 .
chrI 35 >10>13 A C 60 . AC=1;AF=1;AN=1;AT=>10>12>13,>10>11>13;NS=1;LV=0 GT . . . . 1 .
chrI 37 >13>16 C A 60 . AC=1;AF=1;AN=1;AT=>13>14>16,>13>15>16;NS=1;LV=0 GT . . . . 1 .
chrI 41 >16>19 A C 60 . AC=1;AF=1;AN=1;AT=>16>17>19,>16>18>19;NS=1;LV=0 GT . . . . 1 .
that is wrong because the `CHROM` column (the 1st one) contains a value that is different from the contig ID specified above, where again there is the `#0` suffix. This is leading to problems in `pggb`, as described in pangenome/pggb#262 (comment).
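A quick way to detect this kind of inconsistency is to compare the `CHROM` values against the IDs declared in the `##contig` header lines. The following is only a sketch, assuming a plain-text, tab-separated VCF; the function name `chrom_mismatches` is hypothetical, and the example record is shortened from the 1.44.0 output above.

```python
import re

# Matches header lines such as: ##contig=<ID=S288C#1#chrI#0,length=219929>
CONTIG_RE = re.compile(r"^##contig=<ID=([^,>]+)")


def chrom_mismatches(vcf_lines):
    """Return the CHROM values that have no matching ##contig header ID."""
    declared, used = set(), set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        match = CONTIG_RE.match(line)
        if match:
            declared.add(match.group(1))
        elif line and not line.startswith("#"):
            # Data record: CHROM is the first tab-separated column.
            used.add(line.split("\t", 1)[0])
    return sorted(used - declared)


# Shortened example mimicking the 1.44.0 output shown above.
example = [
    "##contig=<ID=S288C#1#chrI#0,length=219929>",
    "chrI\t5\t>8>10\tCCCCA\tC\t60\t.\t.\tGT",
]
print(chrom_mismatches(example))
```

For the example, the function reports `chrI`, because no `##contig` line declares that ID. Because it accepts any iterable of lines, it can also be given an open file handle for a real VCF.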
How can I obtain the behavior of `vg deconstruct` 1.40.0 with the latest versions of `vg`?