LoadError: GeneticVariation.VCF.Reader file format
I ran the following command:
viva -f result2.vcf -o output/directory/
Then I receive this error:
Welcome to VIVA.
Loading dependency packages:
┌ Warning: ORCA.jl has been deprecated and all savefig functionality
│ has been implemented directly in PlotlyBase itself.
│
│ By implementing in PlotlyBase.jl, the savefig routines are automatically
│ available to PlotlyJS.jl also.
└ @ ORCA ~/.julia/packages/ORCA/U5XaN/src/ORCA.jl:8
...
Finished loading packages!
Reading result2.vcf ...
No filters applied. Large vcf files will take a long time to process and heatmap visualizations will lose resolution at this scale unless viewed in interactive html for zooming.
Loading VCF file into memory for visualization
ERROR: LoadError: GeneticVariation.VCF.Reader file format error on line 51 ~>"++_NR=.;"
Stacktrace:
[1] error(::Type{T} where T, ::String, ::Int64, ::String, ::String) at ./error.jl:42
[2] _read!(::GeneticVariation.VCF.Reader, ::BioCore.Ragel.State{BufferedStreams.BufferedInputStream{IOStream}}, ::GeneticVariation.VCF.Record) at /Users/nathan/.julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:164
[3] read!(::GeneticVariation.VCF.Reader, ::GeneticVariation.VCF.Record) at /Users/nathan/.julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:134
[4] tryread!(::GeneticVariation.VCF.Reader, ::GeneticVariation.VCF.Record) at /Users/nathan/.julia/packages/BioCore/YBJvb/src/IO.jl:73
[5] iterate at /Users/nathan/.julia/packages/BioCore/YBJvb/src/IO.jl:84 [inlined] (repeats 2 times)
[6] top-level scope at /usr/local/bin/viva:237
[7] include(::Function, ::Module, ::String) at ./Base.jl:380
[8] include(::Module, ::String) at ./Base.jl:368
[9] exec_options(::Base.JLOptions) at ./client.jl:296
[10] _start() at ./client.jl:506
in expression starting at /usr/local/bin/viva:229
Line 51:
chr1 874570 . C A 209 Pass AF1=0.1339286;ALLELE_ORIGIN=@;AN=2;CLINICAL_SIGNIFICANCE=@;Category=2;DP=210;DP4=38,59,4,9;EFF=INTRON(MODIFIER||||SAMD11|mRNA|CODING|NM_152486|);GLOBAL_MAF=@;GTS=A/C,A/C;Gene_Description=879961;HGVS=(NC_000001.10:g.874570C>A,NM_152486.2:c.520+61C>A,);MQ=40;Observation=AMBIGUOUS;PUBMED_CITATIONS=0;SEL_PRIMARY_EFF=0;STDP4=38,59,4,9;Zygosity=LowAF;dbnsfp1000Gp1_AF=.;dbnsfp1000Gp1_AFR_AF=.;dbnsfp1000Gp1_AMR_AF=.;dbnsfp1000Gp1_ASN_AF=.;dbnsfp1000Gp1_EUR_AF=.;dbnsfp29way_logOdds=.;dbnsfpESP5400_AA_AF=.;dbnsfpEnsembl_transcriptid=.;dbnsfpGERP++_NR=.;dbnsfpGERP++_RS=.;dbnsfpInterpro_domain=.;dbnsfpSIFT_score=.;dbnsfpUniprot_acc=.;pValue=1.0E-209 GT:GQ:PL 1/1:10:0,255,255 1/1:10:0,255,255
Hi @nmhawkey,
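Thanks for bringing this issue up. It looks like there is unexpected formatting on that line. The error text "++_NR=.;" points at the '+' characters in the INFO key dbnsfpGERP++_NR, which the reader most likely does not accept. The fastest way around this is to remove the offending line and write a new "cleaned" file to visualize, with a sed command like the one below. Note that it does not use -i: that flag edits the input in place and prints nothing, so combining it with a redirect would leave the cleaned file empty.
sed '51d' result2.vcf > result2.vcf_cleaned.vcf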
This looks like an issue with the GeneticVariation.jl package, which VIVA depends on to read the VCF file. If the cleaned file, with the offending line removed, still does not run, I would open an issue with them so it can be corrected.
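If other lines carry the same dbnsfpGERP++ keys, deleting them one at a time gets tedious. A possible alternative is to rename the keys so they no longer contain '+'. The sketch below is only a rough idea: the function name is made up for illustration (it is not part of VIVA or GeneticVariation.jl), it assumes that the text "GERP++_" appears only inside those key names, and I have not verified that the renamed file loads.

```shell
# Hypothetical helper, not part of VIVA or GeneticVariation.jl.
# Reads VCF text on stdin and writes it to stdout with the "++" dropped
# from the dbNSFP GERP++ key names, e.g. dbnsfpGERP++_NR -> dbnsfpGERP_NR.
# Assumption: "GERP++_" occurs only inside those key names.
rename_gerp_keys() {
  # In a basic regular expression an unescaped "+" is a literal plus sign,
  # so this pattern matches the exact text "GERP++_".
  sed 's/GERP++_/GERP_/g'
}

# Example (the output file name is a placeholder):
#   rename_gerp_keys < result2.vcf > result2_renamed.vcf
#   viva -f result2_renamed.vcf -o output/directory/
```

Because the substitution runs over every line, any matching ##INFO header definitions are renamed along with the records, so the two should stay consistent.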
Let me know how it goes and if you have any more issues!
George