seqan/seqan3

alignment slice positions

notestaff opened this issue · 1 comment

Platform

  • SeqAn version: 3.3.0

Question

Is there a documented way to extract, from a pairwise alignment result, the alignment slice positions (equivalent to the coordinates[] list of BioPython alignment objects)?

From looking at the code, it seems that the trace result returned from aligned_sequence_builder() includes this info in first_sequence_slice_positions / second_sequence_slice_positions, but the alignment result object does not expose them. Is that correct?

Thanks for the help!
@eseiler @rrahn

rrahn commented

Hi @notestaff, your observation is correct.
At the moment we have the infrastructure for user-defined alignment outputs, but exposing it through the public configuration API of the alignments is still an open TODO. Before we do so, however, we wanted to collect more information about what is actually needed. So your request is very helpful in that sense.
To achieve the same thing right now, you would need to write your own adapter around the alignment object, which itself is just a std::tuple of two aligned_sequences. Each of them is simply the original source sequence wrapped by a gap_decorator.
You can therefore iterate over the two gap_decorators and check with *it == seqan3::gap{} whether the currently referenced symbol is a gap or not. By adding some bookkeeping to track the last non-gap source position, you could provide an interface similar to the BioPython coordinates.
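
A minimal sketch of that bookkeeping, to make the idea concrete. It is not SeqAn code: to keep it self-contained, the two aligned rows are plain strings with '-' standing in for a gap, and the names slice_coordinates and coordinates are made up for this example. With the real gap_decorators, the gap test would be the *it == seqan3::gap{} comparison instead of the character comparison. The begin offsets are assumed to be the source positions at which the alignment starts, supplied by the caller.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <vector>

// Hypothetical result type: one list of source positions per sequence,
// laid out like the two rows of BioPython's coordinates array.
struct coordinates
{
    std::vector<std::size_t> first;
    std::vector<std::size_t> second;
};

// Hypothetical helper: walks both aligned rows column by column and records
// the source positions wherever the gap pattern changes.
// Assumption: rows are plain strings and `gap` marks a gap column entry.
coordinates slice_coordinates(std::string_view row1,
                              std::string_view row2,
                              std::size_t begin1 = 0,
                              std::size_t begin2 = 0,
                              char gap = '-')
{
    if (row1.size() != row2.size())
        throw std::invalid_argument{"aligned rows must have equal length"};

    coordinates out;
    out.first.push_back(begin1);
    out.second.push_back(begin2);

    std::size_t pos1 = begin1; // next unread source position in sequence 1
    std::size_t pos2 = begin2; // next unread source position in sequence 2
    int previous_state = -1;   // gap pattern of the previous column

    for (std::size_t i = 0; i < row1.size(); ++i)
    {
        bool const gap1 = (row1[i] == gap);
        bool const gap2 = (row2[i] == gap);
        int const state = (gap1 ? 1 : 0) | (gap2 ? 2 : 0);

        // A change in gap pattern closes the previous block.
        if (i > 0 && state != previous_state)
        {
            out.first.push_back(pos1);
            out.second.push_back(pos2);
        }

        // Only non-gap symbols consume a source position.
        if (!gap1) ++pos1;
        if (!gap2) ++pos2;
        previous_state = state;
    }

    // Close the last block unless the alignment is empty.
    if (!row1.empty())
    {
        out.first.push_back(pos1);
        out.second.push_back(pos2);
    }
    return out;
}
```

For the rows "ACG-T" and "A-GCT" this yields 0, 1, 2, 3, 3, 4 for the first sequence and 0, 1, 1, 2, 3, 4 for the second, i.e. one entry per block boundary.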

Please let me know if you need more information about this.
Best regards!