liuhc8/Aperture

Output vcf file does not conform to vcf standard (int vs float)

gshiba opened this issue · 0 comments

The output VCF file does not conform to the VCF standard: a FORMAT field declared as Integer in the header is written as a float in the records. This can be reproduced with the example data in this repo.

The header section declares BAR as an Integer:

$ zcat example/test_toyindex_ap12.sv.vcf.gz | grep =BAR
##FORMAT=<ID=BAR,Number=1,Type=Integer,Description="Count of cfDNA molecules supporting the breakpoint">

but the records contain a float:

$ zcat example/test_toyindex_ap12.sv.vcf.gz | grep -v ^# | head -n1 | cut -f9,10
GT:SR:PE:REFSR:VARSR:BAR:UBAR   ./.:1:0:1:1:1.0:1

Note the 1.0 in the BAR (second-to-last) field.
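To check whether other fields or records have the same mismatch, here is a minimal Python sketch that collects every FORMAT key declared as `Type=Integer` in the header and reports sample values that are not plain integers. The function name `find_integer_violations` is hypothetical, and the parsing assumes a well-formed, tab-separated VCF with the `ID`, `Number`, `Type` attribute order shown in the header above. It is not a full VCF validator.

```python
import re

# Matches header lines such as ##FORMAT=<ID=BAR,Number=1,Type=Integer,...>
# Assumption: attributes appear in the order ID, Number, Type.
FORMAT_HEADER = re.compile(r'^##FORMAT=<ID=([^,]+),Number=[^,]+,Type=([^,>]+)')
INTEGER_VALUE = re.compile(r'^-?\d+$')


def find_integer_violations(lines):
    """Yield (chrom, pos, key, value) for values that break a Type=Integer declaration."""
    integer_keys = set()
    for line in lines:
        line = line.rstrip("\n")
        match = FORMAT_HEADER.match(line)
        if match:
            if match.group(2) == "Integer":
                integer_keys.add(match.group(1))
            continue
        if not line or line.startswith("#"):
            continue  # other header lines and blank lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # record without FORMAT/sample columns
        keys = fields[8].split(":")
        for sample in fields[9:]:
            for key, value in zip(keys, sample.split(":")):
                if key not in integer_keys:
                    continue
                # A field may hold several comma-separated values; "." means missing.
                for item in value.split(","):
                    if item != "." and not INTEGER_VALUE.match(item):
                        yield fields[0], fields[1], key, item
```

The function takes any iterable of text lines, so the example file could be passed in via `gzip.open("example/test_toyindex_ap12.sv.vcf.gz", "rt")`.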

This causes tools like bcftools to fail:

$ bcftools version
bcftools 1.14-8-gdc009e5
Using htslib 1.14-7-g1d79f44
[...]

$ bcftools view example/test_toyindex_ap12.sv.vcf.gz 
[...]
##bcftools_viewVersion=1.14-8-gdc009e5+htslib-1.14-7-g1d79f44
##bcftools_viewCommand=view example/test_toyindex_ap12.sv.vcf.gz; Date=Wed Nov 24 19:05:52 2021
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  test_smallref_ap12
[W::vcf_parse_info] INFO 'PRECISE' is not defined in the header, assuming Type=String
[E::vcf_parse_format] Invalid character '.' in 'BAR' FORMAT field at chr21:10521527
Error: VCF parse error
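Until the writer is fixed, one possible workaround is to rewrite the offending values before handing the file to bcftools. The sketch below is only an illustration, not a fix in Aperture itself: the helper name `coerce_format_to_int` is hypothetical, and it assumes BAR values are always whole numbers such as `1.0`, raising an error otherwise instead of silently rounding. It does not address the separate `PRECISE` warning.

```python
def coerce_format_to_int(line, key="BAR"):
    """Return a VCF line with whole-number floats in one FORMAT key rewritten as integers."""
    if line.startswith("#"):
        return line  # header lines are passed through unchanged
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return line  # no FORMAT/sample columns to fix
    keys = fields[8].split(":")
    if key not in keys:
        return line
    index = keys.index(key)
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        # Trailing fields may be dropped in VCF; "." means missing.
        if index >= len(values) or values[index] == ".":
            continue
        number = float(values[index])
        if not number.is_integer():
            # Assumption violated: refuse to round a genuinely fractional value.
            raise ValueError(f"{key}={values[index]} is not a whole number")
        values[index] = str(int(number))
        fields[i] = ":".join(values)
    return "\t".join(fields) + newline
```

Redeclaring BAR as `Type=Float` in the header would also satisfy the parser, but the header describes BAR as a count, so writing integers seems closer to the intent.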