matsengrp/linearham

Inferred ancestral sequences have mutations in ambiguous regions

If I pass in seqs that all have Ns for, say, the first 50 bases of V (because that region wasn't sequenced), the inferred ancestral sequences that linearham gives me have mutations within that region. Maybe that's OK, but it also seems kind of weird. On the one hand, the real biological sequences probably did have some mutations there, and since we know what the naive bases are there, it makes sense to have them in the naive sequence, and then maybe also in the inferred intermediates. On the other hand, it's confusing to see mutations listed in an unsampled region of the sequences; they're really just made-up mutations that have no relation to the data.
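To make the distinction concrete, here is a rough post-processing sketch of one option: flag (or mask) ancestral mutations that fall in columns where every input sequence is N. This is not linearham code or its API; the function names and the toy sequences are hypothetical, and it assumes the observed, naive, and ancestral sequences are already aligned to the same length.

```python
def unsampled_positions(observed_seqs, ambiguous="N"):
    """Return the column indices where every observed sequence is ambiguous."""
    length = len(observed_seqs[0])
    if any(len(s) != length for s in observed_seqs):
        raise ValueError("sequences must be aligned to the same length")
    return {
        i for i in range(length)
        if all(s[i].upper() in ambiguous for s in observed_seqs)
    }


def split_mutations(naive_seq, ancestor_seq, unsampled):
    """Split naive->ancestor differences into data-supported and unsupported ones."""
    supported, unsupported = [], []
    for i, (n, a) in enumerate(zip(naive_seq, ancestor_seq)):
        if n != a:
            (unsupported if i in unsampled else supported).append((i, n, a))
    return supported, unsupported


def mask_ancestor(ancestor_seq, unsampled, ambiguous="N"):
    """Replace ancestor bases in unsampled columns with the ambiguity character."""
    return "".join(
        ambiguous if i in unsampled else c for i, c in enumerate(ancestor_seq)
    )


# Toy example (made-up sequences): the first 4 bases were not sequenced.
observed = ["NNNNACGT", "NNNNACGA"]
naive = "TTGCACGT"
ancestor = "TAGCACGA"

unsampled = unsampled_positions(observed)
supported, unsupported = split_mutations(naive, ancestor, unsampled)
masked = mask_ancestor(ancestor, unsampled)
```

Here `unsupported` holds the mutation at index 1, which sits in the unsampled region, while `supported` holds the one at index 7. Whether to mask such positions or just annotate them is exactly the open question above.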

top: data
bottom: inferred ancestors