eldariont/svim

Underflow on variant position?

Hi,

I'm using SVIM v1.4.1 and noticed that for some of the random and unplaced contigs (e.g. chr5_GL000208v1_random, chrUn_GL000226v1) I get surprisingly high coordinates, suspiciously always 4294967296, i.e. 2^32. Building a normal tabix (.tbi) index fails for variants like that.

I can share the full VCF for this sample if you want. Some examples:

chr5_GL000208v1_random  4294967296      svim.BND.130629 N       ]chr5:47079731]N        7       PASS    SVTYPE=BND;SUPPORT=6;STD_POS1=.;STD_POS2=1.63   GT:DP:AD        ./.:.:.,.
chr17_KI270729v1_random 4294967296      svim.BND.311456 N       [chrX:66676029[N        1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
chr22_KI270736v1_random 4294967296      svim.BND.346279 N       ]chr22_KI270736v1_random:111780]N       1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
chrEBV  4294967296      svim.DUP_TANDEM.6691    N       <DUP:TANDEM>    1       not_fully_covered       SVTYPE=DUP:TANDEM;END=51803;SVLEN=51803;SUPPORT=1;STD_SPAN=.;STD_POS=.  GT:CN:DP:AD     ./.:2:.:.,.
chrUn_GL000226v1        4294967296      svim.DUP_TANDEM.6833    N       <DUP:TANDEM>    1       PASS    SVTYPE=DUP:TANDEM;END=15008;SVLEN=15008;SUPPORT=1;STD_SPAN=.;STD_POS=.  GT:CN:DP:AD     ./.:2:.:.,.
chrUn_KI270435v1        4294967296      svim.BND.347861 N       [chrY:10657300[N        1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
chrUn_KI270435v1        4294967296      svim.BND.347860 N       ]chr16:34065991]N       2       PASS    SVTYPE=BND;SUPPORT=2;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
chrUn_KI270590v1        4294967296      svim.DUP_TANDEM.6993    N       <DUP:TANDEM>    4       PASS    SVTYPE=DUP:TANDEM;END=2914;SVLEN=2914;SUPPORT=4;STD_SPAN=3.95;STD_POS=1.65      GT:CN:DP:AD     ./.:2:.:.,.

I checked the length of chr5_GL000208v1_random, and it is only 92 kb. So something is off here :)

Cheers,
Wouter
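
The length check described above can be automated. The following is a minimal sketch, not part of SVIM or bcftools: it assumes the VCF header carries `##contig=<ID=...,length=...>` lines with `ID` listed before `length`, and the function name `positions_beyond_contig` is made up for illustration. It flags records whose POS lies more than one base past the end of their contig.

```python
import re

# Assumption: contig header lines look like ##contig=<ID=name,length=12345,...>
# with ID appearing before length.
CONTIG_RE = re.compile(r"^##contig=<.*?\bID=([^,>]+).*?\blength=(\d+)")


def positions_beyond_contig(vcf_lines):
    """Return (chrom, pos, record_id, contig_length) for out-of-range records.

    A record is flagged when POS > contig length + 1. Contigs without a
    declared length in the header are skipped.
    """
    lengths = {}
    flagged = []
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if line.startswith("##contig"):
            match = CONTIG_RE.match(line)
            if match:
                lengths[match.group(1)] = int(match.group(2))
            continue
        if not line or line.startswith("#"):
            continue  # other header lines and blank lines
        chrom, pos, record_id = line.split("\t")[:3]
        length = lengths.get(chrom)
        if length is not None and int(pos) > length + 1:
            flagged.append((chrom, int(pos), record_id, length))
    return flagged


if __name__ == "__main__":
    # Hypothetical file name; replace with the VCF to inspect.
    with open("variants.sorted.vcf") as handle:
        for hit in positions_beyond_contig(handle):
            print(*hit, sep="\t")
```

Any record reported here would also be a candidate for breaking the index build, since its coordinate cannot exist on the contig.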

Hi Wouter,

Thanks for reporting this issue. I have observed a similar issue when SVIM erroneously outputs a VCF record with POS=0 (although the VCF spec requires POS to be greater than 0). For some reason, bcftools replaces these wrong POS fields with values of 2^32. I have already fixed the underlying issue causing POS=0 in the output VCF with the following commit: 3c8915a.

Can you confirm that the original VCF output from SVIM contains POS fields with 0 instead of 2^32? If this is the case, could you please reprocess your sample with the current master of SVIM instead of v1.4.1? If this fixes the issue, I can upload the current master as v1.4.2 to PyPI and bioconda.

Cheers
David
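
One way to answer the question above is to scan the raw SVIM output for records with POS=0 and, separately, for records with POS=2^32. This is only a sketch under the assumption that the file is an uncompressed, tab-separated VCF; the function name `classify_suspicious_pos` and the file name are hypothetical.

```python
TWO_POW_32 = 2 ** 32  # 4294967296, the value seen in the sorted output


def classify_suspicious_pos(vcf_lines):
    """Group record (CHROM, ID) pairs by suspicious POS value.

    Returns a dict with two lists: records whose POS is 0 and records
    whose POS equals 2^32. All other records are ignored.
    """
    result = {"zero": [], "two_pow_32": []}
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header and blank lines
        chrom, pos, record_id = line.split("\t")[:3]
        if int(pos) == 0:
            result["zero"].append((chrom, record_id))
        elif int(pos) == TWO_POW_32:
            result["two_pow_32"].append((chrom, record_id))
    return result


if __name__ == "__main__":
    # Hypothetical file name; point this at the unsorted SVIM output.
    with open("variants.vcf") as handle:
        hits = classify_suspicious_pos(handle)
    print("POS=0:", len(hits["zero"]), "POS=2^32:", len(hits["two_pow_32"]))
```

If the unsorted file reports only `POS=0` hits and no `POS=2^32` hits, the large values are introduced downstream rather than by SVIM itself.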

Hi David,

I can confirm that the original SVIM output contains POS=0, and that the 2^32 values appear after bcftools sort but not after bcftools view:

diff -y --suppress-common-lines <(cat variants.vcf) <(bcftools view variants.vcf) | grep 4294967296 # returns nothing
diff -y --suppress-common-lines <(cat variants.vcf) <(bcftools sort variants.vcf) | grep 4294967296
Writing to /tmp/bcftools-sort.IChFRc
Merging 1 temporary files
Cleaning
Done
chr5_GL000208v1_random	0	svim.BND.130634	N	]chr5 |	chr5_GL000208v1_random	4294967296	svim.BND.130634	N
chr17_KI270729v1_random	0	svim.BND.311465	N	[chrX |	chr17_KI270729v1_random	4294967296	svim.BND.311465	N
chr22_KI270736v1_random	0	svim.BND.346285	N	]chr2 |	chr22_KI270736v1_random	4294967296	svim.BND.346285	N
chrEBV	0	svim.DUP_TANDEM.6676	N	<DUP:TANDEM>  |	chrEBV	4294967296	svim.DUP_TANDEM.6676	N	<DUP:
chrUn_GL000226v1	0	svim.DUP_TANDEM.6818	N     |	chrUn_GL000226v1	4294967296	svim.DUP_TANDEM.6818
chrUn_KI270435v1	0	svim.BND.347865	N	]chr1 |	chrUn_KI270435v1	4294967296	svim.BND.347866	N
chrUn_KI270435v1	0	svim.BND.347866	N	[chrY |	chrUn_KI270435v1	4294967296	svim.BND.347865	N
chrUn_KI270590v1	0	svim.DUP_TANDEM.6971	N     |	chrUn_KI270590v1	4294967296	svim.DUP_TANDEM.6971

I'll install it from git and report back to you.

Cheers,
Wouter
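
The side-by-side diff above depends on line order, and sorting can reorder records that share a position (the two chrUn_KI270435v1 records swap places in the output). A rough alternative is to compare POS per record ID. The sketch below assumes record IDs are unique within each file; the helper names `pos_by_id` and `changed_positions` and the file names are invented for this example.

```python
def pos_by_id(vcf_lines):
    """Map each record ID to its (CHROM, POS) pair. Assumes unique IDs."""
    table = {}
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header and blank lines
        chrom, pos, record_id = line.split("\t")[:3]
        table[record_id] = (chrom, int(pos))
    return table


def changed_positions(before_lines, after_lines):
    """Return {ID: (POS before, POS after)} for records whose CHROM or POS differs."""
    before = pos_by_id(before_lines)
    after = pos_by_id(after_lines)
    return {
        record_id: (before[record_id][1], after[record_id][1])
        for record_id in before
        if record_id in after and before[record_id] != after[record_id]
    }


if __name__ == "__main__":
    # Hypothetical file names: the raw SVIM output and the sorted copy.
    with open("variants.vcf") as raw, open("variants.sorted.vcf") as srt:
        for record_id, (old, new) in changed_positions(raw, srt).items():
            print(record_id, old, new, sep="\t")
```

Keying on the ID avoids false differences caused purely by reordering, so only records whose coordinates actually changed are reported.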

The version on GitHub seems to be okay!

Thanks a lot, Wouter, for checking, and sorry for the hassle.
I just released SVIM v1.4.2, so this bug should soon be fixed on bioconda as well.

Cheers,
David

I've merged the changes, thanks again!
bioconda/bioconda-recipes#24743