Bioconductor/Biostrings

Bug: length of PairwiseAlignments

gevro opened this issue · 2 comments

gevro commented

Hi,
Per the documentation, length() should return the length of each alignment in a PairwiseAlignments object:

General information methods
In the code snippets below, x is a PairwiseAlignments object, except otherwise noted.

alphabet(x): Equivalent to alphabet(unaligned(subject(x))).

length(x): The common length of alignedPattern(x) and alignedSubject(x). There is a method for PairwiseAlignmentsSingleSubjectSummary as well.

However, the latest version of Biostrings simply returns the total number of alignments within the PairwiseAlignments object.

I believe this is a bug. Also, there does not seem to be a function to do what the documentation indicates.

Hi,

However, the latest version of Biostrings simply returns the total number of alignments within the PairwiseAlignments object.

This has always been the case and is the expected behavior of length() on a PairwiseAlignments object:

library(Biostrings)
pattern <- AAStringSet(c("PAWHEAE", "PAWHAE", "PAWWHEAE"))
subject <- AAString("HEAGAWGHEE")
x <- pairwiseAlignment(pattern, subject)
x
length(x)
# [1] 3

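Also, I don't see any contradiction with the documentation. alignedPattern(x) and alignedSubject(x) are both AAStringSet objects containing 3 sequences, so their common length is 3, which is what length(x) returns: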

alignedPattern(x)
# AAStringSet object of length 3:
#     width seq
# [1]    10 P---AWHEAE
# [2]    10 P---AW-HAE
# [3]    11 P---AWWHEAE

length(alignedPattern(x))
# [1] 3

alignedSubject(x)
# AAStringSet object of length 3:
#     width seq
# [1]    10 HEAGAWGHEE
# [2]    10 HEAGAWGHEE
# [3]    11 HEAGAWGHE-E

length(alignedSubject(x))
# [1] 3

Use nchar() if you want the length of each alignment in the PairwiseAlignments object:

nchar(x)
# [1] 10 10 11
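
To make the distinction concrete, here is a minimal sketch that puts the two quantities side by side, reusing the example above. It assumes the default global alignment and a Biostrings version that still exports pairwiseAlignment() (I believe newer Bioconductor releases moved the pairwise alignment code to the pwalign package, in which case that package would need to be loaded instead). The variable names n_alignments, len_nchar, len_pattern and len_subject are only illustrative.

```r
library(Biostrings)

pattern <- AAStringSet(c("PAWHEAE", "PAWHAE", "PAWWHEAE"))
subject <- AAString("HEAGAWGHEE")
x <- pairwiseAlignment(pattern, subject)  # default type is "global"

# length() counts the alignments stored in the object,
# one per pattern sequence.
n_alignments <- length(x)

# Per-alignment lengths (gaps included), obtained three ways.
len_nchar   <- nchar(x)
len_pattern <- width(alignedPattern(x))
len_subject <- width(alignedSubject(x))

# For this global alignment the three should agree element-wise,
# and the number of alignments should match the number of patterns.
stopifnot(n_alignments == length(pattern))
stopifnot(all(len_pattern == len_subject))
stopifnot(all(len_nchar == len_pattern))
```

In other words, length() follows the usual vector-like convention (number of elements), while nchar() or width() of the aligned sequences gives the per-alignment size.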

gevro commented

Ok thank you.