mcmero/SVclone

Errors on annotating

Closed · 1 comment

Hi there,
I followed your conda installation instructions and tested your example dataset, and everything worked OK.
My tumour alignment is a CRAM file, which should work as per #19.
However, I encountered two issues.
Here is my command:

svclone annotate -i $SV_INPUT -b $TUMOUR_ALIGNMENT_FILE -s $SAMPLE_NAME --config $CONFIG --blacklist $BLACKLIST --sv_format simple -o $OUTPUT_DIR

where $TUMOUR_ALIGNMENT_FILE is a CRAM file.

My errors are:

Loading SV calls...
Supplied blacklist is not a valid bed file of intervals
Insert mean of 12883.375620, with standard deviation of 1645065.318974 inferred
WARNING: anomalous insert sizes detected. Please 
              double check or consider setting values manually.
Recalibrating consensus alignments...
Warning: record E00170:290:HV5GVCCXX:1:1106:10439:27943 contains invalid attributes, skipping
Warning: record E00170:290:HV5GVCCXX:6:1115:18629:62312 contains invalid attributes, skipping
Warning: record E00170:290:HV5GVCCXX:6:1220:15341:37067 contains invalid attributes, skipping

head encode4_GRCh38_blacklist.bed
chr1	628903	635104
chr1	5850087	5850571
chr1	8909610	8910014
chr1	9574580	9574997
chr1	32043823	32044203
chr1	33818964	33819344
chr1	38674335	38674715
chr1	50017081	50017546
chr1	52996949	52997329
chr1	55372488	55372869

head NYGC21T1_svs_simple.txt 
chr1	pos1	dir1	chr2	pos2	dir2	classification
chr1	22735865	-	chr1	22735931	+	DUP
chr1	22735931	+	chr1	22735865	-	DUP
chr1	43260563	-	chr1	43260641	+	DUP
chr1	43260641	+	chr1	43260563	-	DUP
chr1	147720214	-	chr1	147720256	+	DUP
chr1	147720256	+	chr1	147720214	-	DUP
chr1	158433099	-	chr11	86540641	-	INTRX
chr4	19434216	+	chr8	139313961	+	INTRX
chr5	736194	-	chr5	8359157	-	INV

samtools view NYGC21T-ready.cram | head
E00170:290:HV5GVCCXX:1:1104:24911:19188	145	chr1	9996	0	112S39M	chrUn_KI270750v1	68845	0	TCTTCACACCCTCACAAGCCAACACCAGAGCTCACACACCAACATTTTTTAATGATACGGCGCCCACCGAGACCTACACACTGACGCTCACCCTTTCCCTACCCCTCGCCCTTCCGATCACCCTAACCCTAACCCTAACCCTAACCCTAAC	-'//),6-(((.(-(,/)((,,(-(6/(0((.5,),6;(),,),-----,+>?.,,-$501A03EBD0(/79(>;D>=/,(,4+$(5,=D?(>D-CDDCCA(%?(.#':'?D.?,(+?(-CCBB=7>@BBECCCBAEA?C@;ECC8CD/BA	XA:Z:chr21,-37835803,117S34M,0;chrX,+156030583,34M117S,0;chr4,+190122910,33M118S,0;chr17,+83247362,33M118S,0;chr1,+248946309,33M118S,0;chr1,+248946223,33M118S,0;chr1,-180803,118S33M,0;chr4,-10000,118S33M,0;chrX,-222346,118S33M,0;chr16,-75334199,118S33M,0;chr15,+101981060,32M119S,0;chr1,-10353,119S32M,0;chr3,-10518,119S32M,0;chr7,-10010,119S32M,0;chr5,-10123,119S32M,0;chr5,-10363,119S32M,0;chr2,-181275795,119S32M,0;chr5,-10225,119S32M,0;chr5,-10141,119S32M,0;chr5,-10273,119S32M,0;chr5,-10213,119S32M,0;chr1,-10257,119S32M,0;chr5,-10237,119S32M,0;chr2,+240221871,32M119S,0;chr5,-10159,119S32M,0;chr5,-10381,119S32M,0;chr1,-10051,119S32M,0;chr5,-10087,119S32M,0;chr5,-10195,119S32M,0;chr5,-10105,119S32M,0;chr18,+80263025,32M119S,0;chr5,-10261,119S32M,0;chr5,-10417,119S32M,0;chr5,-10177,119S32M,0;chr5,-10003,119S32M,0;chr5,-10399,119S32M,0;chr5,-10267,119S32M,0;chr5,-10249,119S32M,0;chr10,+133787338,34M117S,1;chr7_KI270899v1_alt,-10,119S32M,0;chr7_KI270899v1_alt,-4,119S32M,0;	MC:Z:102S49M	MQ:i:14	AS:i:34	XS:i:34	mc:i:68743	ms:i:2308	MD:Z:0N0N0N0N0N1A32	NM:i:6	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:1215:21846:54137	1145	chr1	10000	0	43S108M	=	10000	0	TGCTATTCTGGCACGACGCCAAGGGAAGCCTCTGGCGCAATCTATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC	3EE=,9A:BFEEB6EE5EEDGFEEDFFDD?BCCDD5DDFCFCCDAFDDDCCA;DDCCFDDDC>EBDDCCFDDDCCEADCCCECCCCBECCCBBBACCBBECCCBBEACCBBE?CCBBECCCABECCCBBDCCCBBECCCBBECCCCDGDC?	AS:i:108	XS:i:107	mc:i:9999	ms:i:2745	MD:Z:0N107	NM:i:1	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:1215:21846:54137	181	chr1	10000	0	*	=	10000	0	CTGCCCGCCCGCGACTGCCATGGGGGTGGGGGGTGCGTGTCGGCGGGGCTGCGTGTGGACGCGCCTGGGGGAGAAACGCGGAGAGAAGGGATTACGGAGGGGGGGTATTGTGGTAGATGGGGTAGGGAGTGGGGTGAAGGGATGTTTCCTT	+.<))(?(1#>%@.(/=((-/$7%A5/=2%2%;/=$;/4.*%>.5%5&(/;$9.9/5'.$<#;((/A%EDA:D7=.$@$DD0A0A,<DDD8-89)AD0DDDDDD?>C8D?;C?,ACBCCCCABDCCC9ACCCCACCDDBCCCC?;?06=B=	MC:Z:43S108M	MQ:i:0	AS:i:0	XS:i:0	mc:i:10107	ms:i:4580	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:2108:31172:54225	99	chr1	10000	0	99S49M3S	=	10351	377	ACCCTAACCCTAACCCTAACCCTAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCAGAAATCCCGTATGCCGTCTTCTGCTTGAAAAAAACCATAACCCTAACCCTAACCCTAACCCGAACCCTAACCCTGACGCTAACCCGAA	?BCDFCEACCDBEACCDBEACBDBEBCCC0?;:@EB>CABA2>CCBEE>1CCC>;CCACB<CC>BEE5C*<-6,:6A@-/D2BDD8ACBCF>>>.=>8:0,5D8>D1+DBA;@>-?5>12C?DB$<78B:1+-'5B<).4%816.4*6%.,	XA:Z:chr22,-50808000,51M100S,3;chr4,+10125,100S48M3S,2;chr3,-198173424,14S37M100S,0;chr15,-101981111,3S54M94S,4;chr20,-64287309,3S54M94S,4;chr12,+10230,100S48M3S,3;chr18,+10161,100S48M3S,3;chr4,-190122947,3S48M100S,3;chr1,-248946197,3S48M100S,3;chr3,+10287,100S48M3S,3;chr22,-44626526,14S42M95S,2;chr3_KI270784v1_alt,+62100,100S37M14S,0;chr12_GL877875v1_alt,+230,100S48M3S,3;	MC:Z:125S26M	MQ:i:0	AS:i:34	XS:i:38	mc:i:10376	ms:i:3152	MD:Z:0N24T12A2C7	NM:i:4	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:2202:30949:13615	99	chr1	10000	0	148M3S	=	10002	87	CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCCAACCCTAACCCTAACCCTAACACT	?BBFCCCDBEACCDBEACCDBDACCDBEABCDBEACCDBEACCCBEACCDBEACCDBEACCEBEACCBBEADCCCEADD;CD?D;0>955D@CB?D21,B5@*1=D9?DD?'5;1:-=:>:>:?2:*>@?A21+-0<*0<.;=5C@-(/52	MC:Z:66S85M	MQ:i:0	AS:i:142	XS:i:136	mc:i:10086	ms:i:2964	MD:Z:0N125T21	NM:i:2	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:1:2224:11820:11769	65	chr1	10000	0	111M40S	chr22	50807946	0	CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAGCCCGAACCCTAGCCCCAACCCAACCCCAACCCAAACACC	?BCFCCCDBEACCDBEACCDBDACCDBEABCDBEACCDBEACCDBEACCDBEACCDBEACCEBEACCEBCADCEBB?DDDC>?DDECEBDD1=9BDD>C78@D@A7'2)/-))))#=-/9A1+()B**8-55:8?'35*..'3=/.-(/&8	MC:Z:109M42S	MQ:i:0	AS:i:110	XS:i:109	mc:i:50807946	ms:i:2462	MD:Z:0N110	NM:i:1	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:2:1116:27174:62945	99	chr1	10000	0	97S54M	=	10047	72	CCTAACCCTAACCCTAACCCTAACCCAGACCGGAAGCGCAACCGTAAGAACTCCAGTCACATTCAGAAAACCCGTATCCCGTCTCCTGCTCGAAACAATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCAACC	BGEDGACCDBEACCDBEACCDBEACCC'E1A&'EE''&2B.&<&@-.(//&2*20(A*CA@6B)C2/---&(3(AB/(<D./)1)2@)(2)'/@F'/-.,7?):C;F?DAFCFB;D?C?BDBFDG0B21+94B3<D,:BE1,B;C))/B:<	XA:Z:chr18,+10161,98S53M,0;chr22,-50808380,53M98S,0;chr4,-190122718,53M98S,0;chr15,-101981035,53M98S,0;chr1,-248946192,53M98S,0;chr1,-248946276,4S49M98S,0;chr21,-46699868,4M1D49M98S,1;chr7,-159335880,4M1D49M98S,1;chr18,-80262915,4M1D49M98S,1;chr20,-64287308,4S49M98S,0;chr1,-248946346,4S49M98S,0;chr13,-114354160,4M1D49M98S,1;chr1,+10186,98S49M1D4M,1;chr5,+11736,98S49M1D4M,1;chr7,+152897984,98S53M,1;chr17,+113282,98S48M5S,0;chr4,-190122829,5S48M98S,0;chr17,-83247349,45M106S,0;chr6,+147867,95S40M1D16M,2;chr17_GL383563v3_alt,+53282,98S48M5S,0;	MC:Z:125S25M1S	MQ:i:0	AS:i:49	XS:i:53	mc:i:10072	ms:i:2546	MD:Z:0N48T4	NM:i:2	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:3:1124:13464:3577	99	chr1	10000	0	121S30M	=	10052	133	GGCGGGGAGCATACGGGGGGCAGATGTAAAGACAATGAGGAACGGCATAGCGCGCGACGTGCCGCCGTCTCGGACCCTTGCTATTCTGGCACGACGCCAAGGGAAGCCTCTGGCGCACTCTATAACCCTAACCCTAACCCTAACCCTAACC	BEC0CC;C@2C208.C?A?@CC@ECB99EDBB9CE/0/>>;C=2>CCCB::-91&%B=2ABA?1CC3;9CA%(/5)<0>(3ECD+@D>DD@B%CB3D@DFC88A-(ADDD;>13.6D%1*/+?>->E4ADG>BEE,B'44E8ECF5E,DC<	MC:Z:70S81M	MQ:i:0	AS:i:30	XS:i:30	mc:i:10132	ms:i:1868	MD:Z:0N29	NM:i:1	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:4:1102:21420:19153	1145	chr1	10000	0	86S65M	=	10000	0	TGAGGAACGGCATAGCGCGCGACGTGCCGCCGTCTCGGACCCTTGCTATTCTGGCACGACGCCAAGGGAAGCCTCTGGCGCAATCTATAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC	FEGGFHF5FFEFDFE4E5E4EE4CEDE3DD5?GCA5DDDDDCFDDCCDFFCDDDDD3DD4DD>EEDDDEEDDCFBCCD4DCECEBCCBB@CCBCECCCBBECCCBBECCCBBE@CCBBB8CBBBECCCBBECCCBBECCCBBECCCCDEBB	AS:i:65	XS:i:64	mc:i:9999	ms:i:4142	MD:Z:0N64	NM:i:1	RG:Z:NYGC21T
E00170:277:HV3VLCCXX:4:1102:21420:19153	181	chr1	10000	0	*	=	10000	0	TCTGCGTGTGCACGCGCCTGTGGGAGAAACGCGGAGAGAAGGGATTACGGAGGGGGGGTATTGTGGTAGATGGGGTAGGGAGTGGGGTGAAGGGATGTTTCCTTTGTTAGTATTTTGCAGCGCTGCTTAATTTTTTTTCCTAGTTGCCATG	-:/E+=/DA;(,,9#A/1CA?EB@0;+,.3E5E@CEGAEFDC@/F=A2AD7DDDDDD>CEEECEE=@@A7EEDBA@FDDAC8;;DDBDDFEADDDDBFFFDCFFDBFAEACDEEECCCDC2CBCCBEBECEEEEEEDEBBBDAECCDCBBA	MC:Z:86S65M	MQ:i:0	AS:i:0	XS:i:0	mc:i:10064	ms:i:4618	RG:Z:NYGC21T

I wonder if you could help me troubleshoot?
Many thanks in advance!
Al

It looks like SVclone couldn't estimate your insert mean and standard deviation accurately (see the "anomalous insert sizes" warning in your log). This will cause anomalous results downstream.

Please try setting these values manually in the config file (based on what you'd expect for your sequencing experiment) and try running the pipeline again. For example:

# read length of BAM file; -1 = infer dynamically.
read_len: 100

# Mean fragment length (also known as insert length); -1 = infer dynamically.
insert_mean: 300

# Standard deviation of insert length; -1 = infer dynamically.
insert_std: 30
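
If you are unsure what values to expect, one option is to estimate them from the template-length column (TLEN, column 9) of properly paired reads while discarding extreme outliers. The following is a minimal sketch of that idea, not SVclone's own inference code. The function names and the `max_insert` cutoff of 2000 bp are assumptions made for illustration; choose a cutoff that suits your library.

```python
import statistics


def read_tlens(lines):
    """Yield integer template lengths from text lines, one value per line.

    Blank lines and lines that are not integers are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield int(line)
        except ValueError:
            continue


def robust_insert_stats(tlens, max_insert=2000):
    """Return (mean, sample standard deviation) of template lengths.

    Only positive values up to max_insert are kept, so each read pair is
    counted once (the leftmost mate carries the positive TLEN) and mates
    mapped far apart or to other contigs do not inflate the estimate.
    max_insert is a hypothetical cutoff, not an SVclone parameter.
    """
    sizes = [t for t in tlens if 0 < t <= max_insert]
    if len(sizes) < 2:
        raise ValueError("need at least two usable template lengths")
    return statistics.mean(sizes), statistics.stdev(sizes)
```

To use it, write the TLEN values to a text file, for instance with `samtools view -f 2 NYGC21T-ready.cram | cut -f 9 > tlens.txt`, then call `robust_insert_stats(read_tlens(open("tlens.txt")))`. The resulting numbers can be rounded and placed in `insert_mean` and `insert_std`.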