agshumate/Liftoff

Performance on fast-evolving sequences

Closed this issue · 3 comments

Hi,

When testing Liftoff, have you also looked at its performance on gene families in which duplication events are frequent and sequences evolve quickly?

Do you mean mapping extra gene copies from the reference assembly to the target where the sequences are not identical?

Yes, especially for those that show a low level of sequence identity (below 40%, for instance). Is the entire gene structure, from ATG to stop codon, typically captured?

We have not tested this specifically, but I imagine that genes with a sequence identity below 40% would not align well with minimap2 and therefore would not be lifted over accurately. The -copies feature is intended to find extra copies that are nearly identical to the reference.
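
For anyone who wants to check where a given gene pair falls relative to the 40% figure discussed above, here is a minimal sketch of one common way to compute percent identity from two already-aligned sequences. This is only an illustration: it is not how Liftoff or minimap2 score alignments internally, the function names `percent_identity` and `is_below_threshold` are hypothetical, and it assumes the two sequences have been aligned beforehand with `-` marking gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing comparable, treat as no identity
    return 100.0 * matches / compared


def is_below_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if the pair's identity is below the threshold (40% by default,
    the illustrative figure from this thread, not a Liftoff parameter)."""
    return percent_identity(aligned_a, aligned_b) < threshold
```

Note that excluding gap columns from the denominator is only one convention; counting gaps as mismatches would give lower values for the same alignment, so the definition matters when comparing against any cutoff.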