Reg: Absence of length in certain variation events
harish0201 opened this issue · 0 comments
Hi!
I was annotating the SV calls from Parliament2 when I noticed that insertion events have a length of 0 in the tabulated output. Is there a way to report the average length (AVGLEN) from the SV VCF instead?
Here is a minimal example:
- VCF:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT T8B
chr1 150880 DEL0011SUR N <DEL> 6 PASS SUPP=1;SUPP_VEC=00001;AVGLEN=66;SVTYPE=DEL;SVMETHOD=SURVIVORv2;CHR2=chr1;END=150946;CIPOS=0,0;CIEND=0,0;STRANDS=+-;CALLERS=MANTA GT:SP 0/1:MANTA
chr1 295428 TRA00406SUR N N[chr19:295428[> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=100000;SVTYPE=BND;SVMETHOD=SURVIVORv2;CHR2=chr19;END=295428;CIPOS=0,0;CIEND=0,0;STRANDS=++;CALLERS=MANTA GT:SP 0/1:MANTA
chr1 341697 INS0013SUR N <INS> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=59;SVTYPE=INS;SVMETHOD=SURVIVORv2;CHR2=chr1;END=341697;CIPOS=0,0;CIEND=0,0;STRANDS=+-;CALLERS=MANTA GT:SP 1/1:MANTA
chr1 341785 TRA0014SUR N N[chr7:72261928[> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=100000;SVTYPE=BND;SVMETHOD=SURVIVORv2;CHR2=chr7;END=72261928;CIPOS=0,0;CIEND=0,0;STRANDS=++;CALLERS=MANTA GT:SP 0/1:MANTA
chr1 380129 DEL0015SUR N <DEL> 6 PASS SUPP=4;SUPP_VEC=10111;AVGLEN=6001;SVTYPE=DEL;SVMETHOD=SURVIVORv2;CHR2=chr1;END=386165;CIPOS=-11,116;CIEND=-12,0;STRANDS=+-;CALLERS=BREAKDANCER,DELLY,LUMPY,MANTA GT:SP 1/1:BREAKDANCER,DELLY,LUMPY,MANTA
- SANSA output:
[1]ANNOID query.chr query.start query.chr2 query.end query.id query.qual query.svtype query.ct query.svlen query.startfeature query.endfeature query.containedfeature Gene Fusion
id000000004 chr1 150880 chr1 150946 DEL0011SUR 6 DEL 3to5 66 NA NA NA False
id000000005 chr1 295428 chr19 295428 TRA00406SUR 0 BND 3to3 0 NA NA NA False
id000000006 chr1 341697 chr1 341697 INS0013SUR 0 INS NtoN 0 NA NA NA False
id000000007 chr1 341785 chr7 72261928 TRA0014SUR 0 BND 3to3 0 NA NA NA False
id000000008 chr1 380129 chr1 386165 DEL0015SUR 6 DEL 3to5 6036 LOC100685782(0;-) LOC100685782(0;-) NA False
The last column (Gene Fusion) is my own label marking whether genes are fusing, so it can be ignored.
Would it be possible to add this average length of insertion to the SANSA annotation?
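
In the example above, query.svlen appears to match END minus POS (150946 - 150880 = 66 and 386165 - 380129 = 6036, not the AVGLEN of 6001), which would explain the 0 for insertions, where END equals POS. In the meantime, a possible workaround is to read AVGLEN from the INFO column and use it for INS rows whose reported length is 0. The following is a minimal sketch, assuming query.id in the table matches the VCF ID column; the function names are hypothetical and not part of SANSA.

```python
# Two records copied from the VCF above (whitespace-separated here).
SAMPLE_VCF = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT T8B",
    "chr1 150880 DEL0011SUR N <DEL> 6 PASS SUPP=1;SUPP_VEC=00001;AVGLEN=66;"
    "SVTYPE=DEL;SVMETHOD=SURVIVORv2;CHR2=chr1;END=150946;CIPOS=0,0;CIEND=0,0;"
    "STRANDS=+-;CALLERS=MANTA GT:SP 0/1:MANTA",
    "chr1 341697 INS0013SUR N <INS> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=59;"
    "SVTYPE=INS;SVMETHOD=SURVIVORv2;CHR2=chr1;END=341697;CIPOS=0,0;CIEND=0,0;"
    "STRANDS=+-;CALLERS=MANTA GT:SP 1/1:MANTA",
]


def parse_info(info):
    """Turn 'K1=V1;K2=V2;FLAG' into a dict; bare flags map to True."""
    parsed = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def avglen_by_id(vcf_lines):
    """Map each VCF record ID to its AVGLEN value, where present."""
    lengths = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()  # INFO is the 8th column and holds no whitespace
        info = parse_info(fields[7])
        if "AVGLEN" in info:
            # Go through float first in case the value is written as a decimal.
            lengths[fields[2]] = int(float(info["AVGLEN"]))
    return lengths


def fill_svlen(record_id, svtype, svlen, lengths):
    """Use AVGLEN for insertions reported with length 0; otherwise keep svlen."""
    if svtype == "INS" and svlen == 0:
        return lengths.get(record_id, svlen)
    return svlen


lengths = avglen_by_id(SAMPLE_VCF)
print(fill_svlen("INS0013SUR", "INS", 0, lengths))
```

Other SV types are left unchanged here, since their query.svlen is already non-zero or, for BND, the AVGLEN of 100000 looks like a placeholder rather than a real length.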