charite/jannovar

Insertions are not annotated for splicing

dmb107 opened this issue · 3 comments

Describe the bug
I am not sure whether this is a bug or intentional design, but insertion variants are not annotated with splicing effects. The InsertionAnnotationBuilder does not contain the splicing annotation logic that I see in the DeletionAnnotationBuilder:

if (so.overlapsWithSpliceDonorSite(changeInterval))
	varTypes.add(VariantEffect.SPLICE_DONOR_VARIANT);
else if (so.overlapsWithSpliceAcceptorSite(changeInterval))
	varTypes.add(VariantEffect.SPLICE_ACCEPTOR_VARIANT);
else if (so.overlapsWithSpliceRegion(changeInterval))
	varTypes.add(VariantEffect.SPLICE_REGION_VARIANT);
// Check whether the variant overlaps with the stop site.
if (so.overlapsWithTranslationalStopSite(changeInterval))
	varTypes.add(VariantEffect.STOP_LOST);

Is Jannovar intentionally not considering splicing for insertions?
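
To make the question concrete, here is a minimal, self-contained sketch of what an equivalent check for insertions might look like. It is not Jannovar code: the class, the method names, and the coordinate convention (0-based positions on the transcript strand, with an insertion point p lying between bases p-1 and p) are all assumptions made for illustration. The window sizes follow the Sequence Ontology definitions (2 intronic bases for donor/acceptor; 3 exonic plus 8 intronic bases for the splice region), and the rule that both flanking bases must lie inside a window is just one possible convention.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; none of these names belong to Jannovar's API. */
final class InsertionSpliceSketch {

    enum Effect { SPLICE_DONOR_VARIANT, SPLICE_ACCEPTOR_VARIANT, SPLICE_REGION_VARIANT }

    /**
     * True if both bases flanking insertion point p (bases p-1 and p)
     * lie inside the 0-based half-open interval [begin, end).
     */
    static boolean insertionInside(int begin, int end, int p) {
        return begin < p && p < end;
    }

    /**
     * Classifies an insertion point against one intron [intronBegin, intronEnd),
     * given in transcript-strand coordinates (assumed convention).
     * Mirrors the donor / acceptor / region if-else chain used for deletions.
     */
    static Set<Effect> spliceEffects(int p, int intronBegin, int intronEnd) {
        Set<Effect> effects = EnumSet.noneOf(Effect.class);
        if (insertionInside(intronBegin, intronBegin + 2, p)) {
            effects.add(Effect.SPLICE_DONOR_VARIANT);
        } else if (insertionInside(intronEnd - 2, intronEnd, p)) {
            effects.add(Effect.SPLICE_ACCEPTOR_VARIANT);
        } else if (insertionInside(intronBegin - 3, intronBegin + 8, p)
                || insertionInside(intronEnd - 8, intronEnd + 3, p)) {
            effects.add(Effect.SPLICE_REGION_VARIANT);
        }
        return effects;
    }

    public static void main(String[] args) {
        // Toy intron spanning [100, 200).
        System.out.println(spliceEffects(101, 100, 200)); // between the two donor bases
        System.out.println(spliceEffects(100, 100, 200)); // exactly at the exon/intron boundary
        System.out.println(spliceEffects(150, 100, 200)); // deep intronic
    }
}
```

With this convention an insertion exactly at the exon/intron boundary counts as a splice region variant rather than a donor variant, because only one of its flanking bases is a donor base. Whether that is the desired behaviour is a design decision.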

To Reproduce
Here is a left-aligned insertion that I expected to be annotated as splice_region_variant based on its genomic position.
[Screenshot, 2022-07-12 2:30:11 PM]

annotate-var/refseq/hg19/chrX/108924495/C/CTTA

[
	{
		"transcriptId": "NM_001318509.2",
		"variantEffects": [
			"disruptive_inframe_insertion"
		],
		"isCoding": true,
		"hgvsProtein": "p.(L212_K213insN)",
		"hgvsNucleotides": "c.636_638dup"
	},
	{
		"transcriptId": "NM_001318510.2",
		"variantEffects": [
			"disruptive_inframe_insertion"
		],
		"isCoding": true,
		"hgvsProtein": "p.(L171_K172insN)",
		"hgvsNucleotides": "c.513_515dup"
	},
	{
		"transcriptId": "NM_004458.3",
		"variantEffects": [
			"disruptive_inframe_insertion"
		],
		"isCoding": true,
		"hgvsProtein": "p.(L171_K172insN)",
		"hgvsNucleotides": "c.513_515dup"
	},
	{
		"transcriptId": "NM_022977.3",
		"variantEffects": [
			"disruptive_inframe_insertion"
		],
		"isCoding": true,
		"hgvsProtein": "p.(L212_K213insN)",
		"hgvsNucleotides": "c.636_638dup"
	}
]

Additional context
I ran VEP online to compare and found that it annotates the insertion with splice_region_variant.
[Screenshot, 2022-07-12 2:31:05 PM]

Example variant with region chr1:23685940-23696357

SNV with splice_region_variant (positive control)

{
	"assembly": "GRCh37",
	"ref": "T",
	"alt": "C",
	"contig": "chr1",
	"start": 23693536,
	"stop": 23693536,
	"variant_type": "snv",
	"annotators": ["Jannovar"]
}

"data": {
	"transcriptId": "NM_001077195.2",
	"source": "refseq",
	"variantEffects": [
		"splice_region_variant",
		"synonymous_variant"
	],
	"isCoding": true,
	"hgvsProtein": "p.(=)",
	"hgvsNucleotides": "c.159A>G"
},

Insertion at the same position, within the splice region, that is not annotated with splice_region_variant:

{
	"assembly": "GRCh37",
	"ref": "T",
	"alt": "TCA",
	"contig": "chr1",
	"start": 23693536,
	"stop": 23693537,
	"variant_type": "ins",
	"annotators": ["Jannovar"]
}

"data": {
	"transcriptId": "NM_001077195.2",
	"source": "refseq",
	"variantEffects": [
		"frameshift_elongation"
	],
	"isCoding": true,
	"hgvsProtein": "p.(D54Efs*11)",
	"hgvsNucleotides": "c.158_159insTG"
},
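
For reference, the insertion in the request above sits between two adjacent reference bases, which is why it has no reference base of its own to test against a splice window. A rough sketch of deriving that insertion point from a VCF-style ref/alt pair follows. The class and method names are hypothetical, and it assumes a pure insertion whose alt allele starts with the full ref allele (the usual left anchor base).

```java
/** Hypothetical helper; not part of Jannovar. */
final class InsertionPointSketch {

    /**
     * Returns the 0-based insertion point p (the inserted bases go between
     * 0-based bases p-1 and p), or -1 if ref/alt is not a pure insertion
     * anchored on the left by the ref allele.
     */
    static int insertionPoint(int pos1Based, String ref, String alt) {
        if (alt.length() <= ref.length() || !alt.startsWith(ref)) {
            return -1;
        }
        // 1-based pos -> 0-based (pos - 1), then skip the shared anchor bases.
        return pos1Based - 1 + ref.length();
    }

    /** Returns the inserted bases, i.e. alt with the shared ref prefix removed. */
    static String insertedBases(String ref, String alt) {
        return alt.substring(ref.length());
    }

    public static void main(String[] args) {
        int p = insertionPoint(23693536, "T", "TCA");
        // In 1-based terms the insertion lies between positions p and p + 1.
        System.out.println(p + " / " + (p + 1) + " inserted: " + insertedBases("T", "TCA"));
    }
}
```

For ref T, alt TCA at 23693536 this places the insertion between 1-based positions 23693536 and 23693537, matching the start and stop values in the request.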

Good catch, thank you.

@holtgrewe we're happy to contribute!