Feature: Create functions to get reference and alternate bases from `Variation`
MillironX opened this issue · 0 comments
Expected behavior
There should be two new functions, `refbases(v::Variation{S,T})` and `altbases(v::Variation{S,T})`, which return the reference or alternate bases of `v` as an `S where {S <: BioSequence}`. These functions should follow the VCF specification for representing alternates:
For simple insertions and deletions in which either the REF or one of the ALT alleles would otherwise be null/empty, the REF and ALT Strings must include the base before the event ..., unless the event occurs at position 1 on the contig in which case it must include the base after the event
From the spec examples:
refbases(Variation(dna"ATCGA", "C3G")) == dna"C"
altbases(Variation(dna"ATCGA", "C3G")) == dna"G"
refbases(Variation(dna"ATCGA", "Δ3-3")) == dna"TC"
altbases(Variation(dna"ATCGA", "Δ3-3")) == dna"T"
refbases(Variation(dna"ATCGA", "3A")) == dna"C"
altbases(Variation(dna"ATCGA", "3A")) == dna"CA"
Current behavior
None. No such functions exist yet.
Possible implementation
This should be fairly straightforward using the `leftposition` and `rightposition` functions.
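A rough sketch of the anchoring rule quoted above, written against plain `String`s rather than `BioSequence`s so it runs without any packages. The names `vcf_deletion` and `vcf_insertion` are hypothetical and not part of any existing API; a real implementation would presumably take its coordinates from the `Variation` itself. The insertion helper assumes the inserted bases follow the given position, which is what the `"3A"` example above implies.

```julia
# Hypothetical helpers on plain Strings (ASCII only); illustration, not a library API.

# Deletion of reference positions first:last (1-based, inclusive).
# Returns a (REF, ALT) tuple padded with an anchor base as the VCF spec requires.
function vcf_deletion(ref::String, first::Int, last::Int)
    if first == 1
        # Event at position 1: anchor on the base after the event.
        anchor = ref[last+1:last+1]
        return (ref[first:last] * anchor, anchor)
    end
    # Otherwise anchor on the base before the event.
    anchor = ref[first-1:first-1]
    return (anchor * ref[first:last], anchor)
end

# Insertion of `bases` immediately after reference position `pos`
# (assumed convention, matching the "3A" example in this issue).
function vcf_insertion(ref::String, pos::Int, bases::String)
    anchor = ref[pos:pos]
    return (anchor, anchor * bases)
end

vcf_deletion("ATCGA", 3, 3)     # ("TC", "T")
vcf_deletion("ATCGA", 1, 1)     # ("AT", "T")
vcf_insertion("ATCGA", 3, "A")  # ("C", "CA")
```

This sketch does not handle a deletion that spans the entire reference, where no anchor base exists on either side.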
Context
These functions provide a snapshot of what changed in a `Variation`. They will allow trivial export to VCF format if interchange is required (my primary use case).