Filter reference segments from CNV caller output
Jakob37 opened this issue · 5 comments
Description of the bug
The CNV caller output contains, in addition to the duplication and deletion calls, the segments in between them. For these segments the genotype is set to "0/0" and the ALT column is not marked as deletion <DEL> or duplication <DUP>.
I suspect this to be an error - we don't want to continue processing the ranges between the actual call ranges. They should probably be filtered out before being merged with the outputs of the other SV-callers.
An example output is seen below. In short - my guess is that we want to remove all records where the ALT column is "." rather than <DEL> or <DUP>.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT giab_sample
chr1 792501 CNV_chr1_792501_1632500 N . 3076.53 . END=1632500 GT:CN:NP:QA:QS:QSE:QSS 0/0:2:825:9:3077:12:123
chr1 1632501 CNV_chr1_1632501_1635500 N <DEL> 29.40 . END=1635500 GT:CN:NP:QA:QS:QSE:QSS 0/1:1:3:11:29:16:12
chr1 1635501 CNV_chr1_1635501_1709500 N . 1829.22 . END=1709500 GT:CN:NP:QA:QS:QSE:QSS 0/0:2:72:4:1829:3:16
chr1 1710501 CNV_chr1_1710501_1712500 N <DEL> 18.93 . END=1712500 GT:CN:NP:QA:QS:QSE:QSS 0/1:1:2:9:19:37:3
chr1 1712501 CNV_chr1_1712501_1714500 N <DEL> 422.43 . END=1714500 GT:CN:NP:QA:QS:QSE:QSS 1/1:0:2:221:422:239:185
chr1 1714501 CNV_chr1_1714501_2121500 N . 3076.53 . END=2121500 GT:CN:NP:QA:QS:QSE:QSS 0/0:2:402:2:3077:26:34
chr1 2121501 CNV_chr1_2121501_2124500 N <DEL> 27.26 . END=2124500 GT:CN:NP:QA:QS:QSE:QSS 0/1:1:3:21:27:22:26
chr1 2124501 CNV_chr1_2124501_2651500 N . 3076.53 . END=2651500 GT:CN:NP:QA:QS:QSE:QSS 0/0:2:514:63:3077:53:22
chr1 2656501 CNV_chr1_2656501_2672500 N <DEL> 94.12 . END=2672500 GT:CN:NP:QA:QS:QSE:QSS 0/1:1:4:19:94:18:53
chr1 2674501 CNV_chr1_2674501_2675500 N <DUP> 2.49 . END=2675500 GT:CN:NP:QA:QS:QSE:QSS ./.:3:1:2:2:2:2
chr1 2677501 CNV_chr1_2677501_2683500 N . 32.24 . END=2683500 GT:CN:NP:QA:QS:QSE:QSS 0/0:2:3:10:32:18:8
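A minimal sketch of the filtering suggested above, assuming the rule really is "drop every record whose ALT column is `.`" and that the file is a plain tab-separated VCF. The function name `drop_reference_segments` is hypothetical and this is not the pipeline's actual fix, only an illustration of the idea:

```python
from typing import Iterable, Iterator


def drop_reference_segments(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines and records whose ALT column is not '.'.

    Assumption: records with ALT '.' are the reference segments
    between the actual <DEL>/<DUP> calls and can be discarded.
    """
    for line in lines:
        # Header and meta lines are passed through unchanged.
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # ALT is the fifth column (index 4) of a VCF record.
        if len(fields) > 4 and fields[4] != ".":
            yield line


if __name__ == "__main__":
    import sys

    # Read a VCF from stdin and write the filtered VCF to stdout.
    sys.stdout.writelines(drop_reference_segments(sys.stdin))
```

Applied to the example above, this would keep the header line plus the <DEL> and <DUP> records and drop the "0/0" segments. Whether low-quality calls such as the <DUP> with genotype "./." should also be removed is a separate question that this sketch does not address.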
Command used and terminal output
No response
Relevant files
No response
System information
No response
yeah this looks wrong. I'll look into it
Is this from the gatk or cnvnator?
This is from GATK
Sounds great @Jakob37 ❤️
@ramprasadn might be working on #442, so perhaps start on #444 until we can confirm with him