Errors on annotating
Hi there,
I followed your conda installation instructions and tested the example dataset, and everything worked.
I have a tumour CRAM file, which should work as per #19.
However, I encountered two issues.
Here is my command:
svclone annotate -i $SV_INPUT -b $TUMOUR_ALIGNMENT_FILE -s $SAMPLE_NAME --config $CONFIG --blacklist $BLACKLIST --sv_format simple -o $OUTPUT_DIR
where $TUMOUR_ALIGNMENT_FILE is a CRAM file.
My errors are:
Loading SV calls...
Supplied blacklist is not a valid bed file of intervals
Insert mean of 12883.375620, with standard deviation of 1645065.318974 inferred
WARNING: anomalous insert sizes detected. Please
double check or consider setting values manually.
Recalibrating consensus alignments...
Warning: record E00170:290:HV5GVCCXX:1:1106:10439:27943 contains invalid attributes, skipping
Warning: record E00170:290:HV5GVCCXX:6:1115:18629:62312 contains invalid attributes, skipping
Warning: record E00170:290:HV5GVCCXX:6:1220:15341:37067 contains invalid attributes, skipping
head encode4_GRCh38_blacklist.bed
chr1 628903 635104
chr1 5850087 5850571
chr1 8909610 8910014
chr1 9574580 9574997
chr1 32043823 32044203
chr1 33818964 33819344
chr1 38674335 38674715
chr1 50017081 50017546
chr1 52996949 52997329
chr1 55372488 55372869
head NYGC21T1_svs_simple.txt
chr1 pos1 dir1 chr2 pos2 dir2 classification
chr1 22735865 - chr1 22735931 + DUP
chr1 22735931 + chr1 22735865 - DUP
chr1 43260563 - chr1 43260641 + DUP
chr1 43260641 + chr1 43260563 - DUP
chr1 147720214 - chr1 147720256 + DUP
chr1 147720256 + chr1 147720214 - DUP
chr1 158433099 - chr11 86540641 - INTRX
chr4 19434216 + chr8 139313961 + INTRX
chr5 736194 - chr5 8359157 - INV
samtools view NYGC21T-ready.cram | head
E00170:290:HV5GVCCXX:1:1104:24911:19188 145 chr1 9996 0 112S39M chrUn_KI270750v1 68845 0 TCTTCACACCCTCACAAGCCAACACCAGAGCTCACACACCAACATTTTTTAATGATACGGCGCCCACCGAGACCTACACACTGACGCTCACCCTTTCCCTACCCCTCGCCCTTCCGATCACCCTAACCCTAACCCTAACCCTAACCCTAAC -'//),6-(((.(-(,/)((,,(-(6/(0((.5,),6;(),,),-----,+>?.,,-$501A03EBD0(/79(>;D>=/,(,4+$(5,=D?(>D-CDDCCA(%?(.#':'?D.?,(+?(-CCBB=7>@BBECCCBAEA?C@;ECC8CD/BA XA:Z:chr21,-37835803,117S34M,0;chrX,+156030583,34M117S,0;chr4,+190122910,33M118S,0;chr17,+83247362,33M118S,0;chr1,+248946309,33M118S,0;chr1,+248946223,33M118S,0;chr1,-180803,118S33M,0;chr4,-10000,118S33M,0;chrX,-222346,118S33M,0;chr16,-75334199,118S33M,0;chr15,+101981060,32M119S,0;chr1,-10353,119S32M,0;chr3,-10518,119S32M,0;chr7,-10010,119S32M,0;chr5,-10123,119S32M,0;chr5,-10363,119S32M,0;chr2,-181275795,119S32M,0;chr5,-10225,119S32M,0;chr5,-10141,119S32M,0;chr5,-10273,119S32M,0;chr5,-10213,119S32M,0;chr1,-10257,119S32M,0;chr5,-10237,119S32M,0;chr2,+240221871,32M119S,0;chr5,-10159,119S32M,0;chr5,-10381,119S32M,0;chr1,-10051,119S32M,0;chr5,-10087,119S32M,0;chr5,-10195,119S32M,0;chr5,-10105,119S32M,0;chr18,+80263025,32M119S,0;chr5,-10261,119S32M,0;chr5,-10417,119S32M,0;chr5,-10177,119S32M,0;chr5,-10003,119S32M,0;chr5,-10399,119S32M,0;chr5,-10267,119S32M,0;chr5,-10249,119S32M,0;chr10,+133787338,34M117S,1;chr7_KI270899v1_alt,-10,119S32M,0;chr7_KI270899v1_alt,-4,119S32M,0; MC:Z:102S49M MQ:i:14 AS:i:34 XS:i:34 mc:i:68743 ms:i:2308 MD:Z:0N0N0N0N0N1A32 NM:i:6 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:1215:21846:54137 1145 chr1 10000 0 43S108M = 10000 0 TGCTATTCTGGCACGACGCCAAGGGAAGCCTCTGGCGCAATCTATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC 3EE=,9A:BFEEB6EE5EEDGFEEDFFDD?BCCDD5DDFCFCCDAFDDDCCA;DDCCFDDDC>EBDDCCFDDDCCEADCCCECCCCBECCCBBBACCBBECCCBBEACCBBE?CCBBECCCABECCCBBDCCCBBECCCBBECCCCDGDC? AS:i:108 XS:i:107 mc:i:9999 ms:i:2745 MD:Z:0N107 NM:i:1 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:1215:21846:54137 181 chr1 10000 0 * = 10000 0 CTGCCCGCCCGCGACTGCCATGGGGGTGGGGGGTGCGTGTCGGCGGGGCTGCGTGTGGACGCGCCTGGGGGAGAAACGCGGAGAGAAGGGATTACGGAGGGGGGGTATTGTGGTAGATGGGGTAGGGAGTGGGGTGAAGGGATGTTTCCTT +.<))(?(1#>%@.(/=((-/$7%A5/=2%2%;/=$;/4.*%>.5%5&(/;$9.9/5'.$<#;((/A%EDA:D7=.$@$DD0A0A,<DDD8-89)AD0DDDDDD?>C8D?;C?,ACBCCCCABDCCC9ACCCCACCDDBCCCC?;?06=B= MC:Z:43S108M MQ:i:0 AS:i:0 XS:i:0 mc:i:10107 ms:i:4580 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:2108:31172:54225 99 chr1 10000 0 99S49M3S = 10351 377 ACCCTAACCCTAACCCTAACCCTAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCAGAAATCCCGTATGCCGTCTTCTGCTTGAAAAAAACCATAACCCTAACCCTAACCCTAACCCGAACCCTAACCCTGACGCTAACCCGAA ?BCDFCEACCDBEACCDBEACBDBEBCCC0?;:@EB>CABA2>CCBEE>1CCC>;CCACB<CC>BEE5C*<-6,:6A@-/D2BDD8ACBCF>>>.=>8:0,5D8>D1+DBA;@>-?5>12C?DB$<78B:1+-'5B<).4%816.4*6%., XA:Z:chr22,-50808000,51M100S,3;chr4,+10125,100S48M3S,2;chr3,-198173424,14S37M100S,0;chr15,-101981111,3S54M94S,4;chr20,-64287309,3S54M94S,4;chr12,+10230,100S48M3S,3;chr18,+10161,100S48M3S,3;chr4,-190122947,3S48M100S,3;chr1,-248946197,3S48M100S,3;chr3,+10287,100S48M3S,3;chr22,-44626526,14S42M95S,2;chr3_KI270784v1_alt,+62100,100S37M14S,0;chr12_GL877875v1_alt,+230,100S48M3S,3; MC:Z:125S26M MQ:i:0 AS:i:34 XS:i:38 mc:i:10376 ms:i:3152 MD:Z:0N24T12A2C7 NM:i:4 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:2202:30949:13615 99 chr1 10000 0 148M3S = 10002 87 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCCAACCCTAACCCTAACCCTAACACT ?BBFCCCDBEACCDBEACCDBDACCDBEABCDBEACCDBEACCCBEACCDBEACCDBEACCEBEACCBBEADCCCEADD;CD?D;0>955D@CB?D21,B5@*1=D9?DD?'5;1:-=:>:>:?2:*>@?A21+-0<*0<.;=5C@-(/52 MC:Z:66S85M MQ:i:0 AS:i:142 XS:i:136 mc:i:10086 ms:i:2964 MD:Z:0N125T21 NM:i:2 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:2224:11820:11769 65 chr1 10000 0 111M40S chr22 50807946 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAGCCCGAACCCTAGCCCCAACCCAACCCCAACCCAAACACC ?BCFCCCDBEACCDBEACCDBDACCDBEABCDBEACCDBEACCDBEACCDBEACCDBEACCEBEACCEBCADCEBB?DDDC>?DDECEBDD1=9BDD>C78@D@A7'2)/-))))#=-/9A1+()B**8-55:8?'35*..'3=/.-(/&8 MC:Z:109M42S MQ:i:0 AS:i:110 XS:i:109 mc:i:50807946 ms:i:2462 MD:Z:0N110 NM:i:1 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:2:1116:27174:62945 99 chr1 10000 0 97S54M = 10047 72 CCTAACCCTAACCCTAACCCTAACCCAGACCGGAAGCGCAACCGTAAGAACTCCAGTCACATTCAGAAAACCCGTATCCCGTCTCCTGCTCGAAACAATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCAACC BGEDGACCDBEACCDBEACCDBEACCC'E1A&'EE''&2B.&<&@-.(//&2*20(A*CA@6B)C2/---&(3(AB/(<D./)1)2@)(2)'/@F'/-.,7?):C;F?DAFCFB;D?C?BDBFDG0B21+94B3<D,:BE1,B;C))/B:< XA:Z:chr18,+10161,98S53M,0;chr22,-50808380,53M98S,0;chr4,-190122718,53M98S,0;chr15,-101981035,53M98S,0;chr1,-248946192,53M98S,0;chr1,-248946276,4S49M98S,0;chr21,-46699868,4M1D49M98S,1;chr7,-159335880,4M1D49M98S,1;chr18,-80262915,4M1D49M98S,1;chr20,-64287308,4S49M98S,0;chr1,-248946346,4S49M98S,0;chr13,-114354160,4M1D49M98S,1;chr1,+10186,98S49M1D4M,1;chr5,+11736,98S49M1D4M,1;chr7,+152897984,98S53M,1;chr17,+113282,98S48M5S,0;chr4,-190122829,5S48M98S,0;chr17,-83247349,45M106S,0;chr6,+147867,95S40M1D16M,2;chr17_GL383563v3_alt,+53282,98S48M5S,0; MC:Z:125S25M1S MQ:i:0 AS:i:49 XS:i:53 mc:i:10072 ms:i:2546 MD:Z:0N48T4 NM:i:2 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:3:1124:13464:3577 99 chr1 10000 0 121S30M = 10052 133 GGCGGGGAGCATACGGGGGGCAGATGTAAAGACAATGAGGAACGGCATAGCGCGCGACGTGCCGCCGTCTCGGACCCTTGCTATTCTGGCACGACGCCAAGGGAAGCCTCTGGCGCACTCTATAACCCTAACCCTAACCCTAACCCTAACC BEC0CC;C@2C208.C?A?@CC@ECB99EDBB9CE/0/>>;C=2>CCCB::-91&%B=2ABA?1CC3;9CA%(/5)<0>(3ECD+@D>DD@B%CB3D@DFC88A-(ADDD;>13.6D%1*/+?>->E4ADG>BEE,B'44E8ECF5E,DC< MC:Z:70S81M MQ:i:0 AS:i:30 XS:i:30 mc:i:10132 ms:i:1868 MD:Z:0N29 NM:i:1 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:4:1102:21420:19153 1145 chr1 10000 0 86S65M = 10000 0 TGAGGAACGGCATAGCGCGCGACGTGCCGCCGTCTCGGACCCTTGCTATTCTGGCACGACGCCAAGGGAAGCCTCTGGCGCAATCTATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC FEGGFHF5FFEFDFE4E5E4EE4CEDE3DD5?GCA5DDDDDCFDDCCDFFCDDDDD3DD4DD>EEDDDEEDDCFBCCD4DCECEBCCBB@CCBCECCCBBECCCBBECCCBBE@CCBBB8CBBBECCCBBECCCBBECCCBBECCCCDEBB AS:i:65 XS:i:64 mc:i:9999 ms:i:4142 MD:Z:0N64 NM:i:1 RG:Z:NYGC21T
E00170:277:HV3VLCCXX:4:1102:21420:19153 181 chr1 10000 0 * = 10000 0 TCTGCGTGTGCACGCGCCTGTGGGAGAAACGCGGAGAGAAGGGATTACGGAGGGGGGGTATTGTGGTAGATGGGGTAGGGAGTGGGGTGAAGGGATGTTTCCTTTGTTAGTATTTTGCAGCGCTGCTTAATTTTTTTTCCTAGTTGCCATG -:/E+=/DA;(,,9#A/1CA?EB@0;+,.3E5E@CEGAEFDC@/F=A2AD7DDDDDD>CEEECEE=@@A7EEDBA@FDDAC8;;DDBDDFEADDDDBFFFDCFFDBFAEACDEEECCCDC2CBCCBEBECEEEEEEDEBBBDAECCDCBBA MC:Z:86S65M MQ:i:0 AS:i:0 XS:i:0 mc:i:10064 ms:i:4618 RG:Z:NYGC21T
I wonder if you could help me troubleshoot?
Many thanks in advance!
Al
It looks like SVclone couldn't estimate your insert size mean and standard deviation accurately: it inferred a mean of about 12,883 with a standard deviation of about 1.6 million. This will cause anomalous results downstream.
Please try setting these values manually in the config file, based on what you'd expect for your sequencing experiment, and run the pipeline again. For example:
# read length of BAM file; -1 = infer dynamically.
read_len: 100
# Mean fragment length (also known as insert length); -1 = infer dynamically.
insert_mean: 300
# Standard deviation of insert length; -1 = infer dynamically.
insert_std: 30
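
If the expected values for the sequencing run aren't known, one way to get a rough starting point is to compute them from the alignment file itself while discarding the extreme template lengths that likely inflated the inferred numbers. The sketch below is not part of SVclone; the function names `tlens_from_sam` and `estimate_insert_size` and the `max_tlen` cutoff of 2000 are hypothetical choices for illustration. It assumes that SAM text is supplied as lines (for example from `samtools view`), that only properly paired, primary, non-duplicate reads should count, and that the SAM TLEN column is a reasonable stand-in for what the config calls insert length.

```python
import statistics


def tlens_from_sam(lines):
    """Yield TLEN (SAM column 9) for properly paired, primary, non-duplicate reads."""
    for line in lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        # Require 0x2 (proper pair); drop 0x4 unmapped, 0x100 secondary,
        # 0x400 duplicate and 0x800 supplementary records.
        if not flag & 0x2 or flag & (0x4 | 0x100 | 0x400 | 0x800):
            continue
        yield int(fields[8])


def estimate_insert_size(tlens, max_tlen=2000):
    """Return (mean, std) over template lengths in the range 1..max_tlen.

    max_tlen is an assumed cutoff for illustration; pick one that suits
    the library. Negative TLENs (the reverse mate of each pair) are
    dropped so that each pair is counted once.
    """
    kept = [t for t in tlens if 0 < t <= max_tlen]
    if len(kept) < 2:
        raise ValueError("need at least two usable template lengths")
    return statistics.mean(kept), statistics.stdev(kept)
```

To use it, pipe a sample of records from `samtools view` into a small script that calls `estimate_insert_size(tlens_from_sam(sys.stdin))`, then round the results and put them in `insert_mean` and `insert_std`. Treat the output as a sanity check against what the library prep should have produced rather than as a definitive value.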