baoxingsong/AnchorWave

Does the result maf file contain one-to-one alignment?

Zhuxitong opened this issue · 2 comments

Hi, @baoxingsong

I see that with MUMmer or LAST, extra steps are needed to filter the results and retain only the one-to-one alignments:

The minimap2, MUMmer4, and GSalign results were filtered to one-to-one alignments using the “last-split | maf-swap | last-split | maf-swap”

However, I didn't notice a similar step in AnchorWave, so I have several questions:

  1. Does this mean the alignment result from AnchorWave is already one-to-one?
  2. Or, for the different algorithms (3 in total, command 4), might some results be one-to-one while others are many-to-many (including one-to-many and many-to-one)?
  3. If the results are not one-to-one, can we use the “last-split | maf-swap | last-split | maf-swap” pipeline mentioned above to obtain one-to-one results from AnchorWave?

Looking forward to your helpful insights!

Thanks.
AnchorWave implemented three algorithms:

  1. Longest-path approach for genome sequences without inversions or rearrangements.
  2. Longest path considering inversions.
  3. Longest path considering inversions, rearrangements, and WGDs.

The first two algorithms are implemented as the genoAli command, and the third is implemented as the proali command.
The first two algorithms always output one-to-one alignments. The alignment depth for the third algorithm can be controlled via -R and -Q.

The algorithms implemented in AnchorWave are very different from those implemented in MUMmer4 and GSalign. Please make sure you are familiar with your input data. It is highly recommended to read our documentation or publication to understand how to choose AnchorWave parameters.
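
To check empirically whether a given MAF file is one-to-one, one option is to measure the maximum alignment depth on each genome. The following is a minimal sketch, not part of AnchorWave: the function names are hypothetical, and it assumes a pairwise MAF in which the first `s` line of each block is the reference and the second is the query. A result of `(1, 1)` would indicate a one-to-one alignment.

```python
from collections import defaultdict


def maf_blocks(lines):
    """Yield one list of (name, start, end) per MAF block, in forward-strand coordinates."""
    block = []
    for line in lines:
        fields = line.split()
        # A blank line or a new "a" line closes the current block.
        if not fields or fields[0] == "a":
            if block:
                yield block
                block = []
            continue
        if fields[0] == "s":
            name, strand = fields[1], fields[4]
            start, size, src_size = int(fields[2]), int(fields[3]), int(fields[5])
            if strand == "-":
                # MAF gives minus-strand starts relative to the reverse complement.
                start = src_size - start - size
            block.append((name, start, start + size))
    if block:
        yield block


def max_depth(intervals):
    """Return the largest number of half-open intervals covering any single base."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    depth = best = 0
    # At equal positions, -1 sorts before +1, so adjacent intervals do not count as overlapping.
    for _, delta in sorted(events):
        depth += delta
        best = max(best, depth)
    return best


def alignment_depth(lines):
    """Return (max reference depth, max query depth) over all sequences in a pairwise MAF."""
    ref, qry = defaultdict(list), defaultdict(list)
    for block in maf_blocks(lines):
        if len(block) < 2:
            continue  # not a pairwise block; skipped in this sketch
        (r_name, r_start, r_end), (q_name, q_start, q_end) = block[0], block[1]
        ref[r_name].append((r_start, r_end))
        qry[q_name].append((q_start, q_end))
    ref_depth = max((max_depth(v) for v in ref.values()), default=0)
    qry_depth = max((max_depth(v) for v in qry.values()), default=0)
    return ref_depth, qry_depth
```

With a real file, this could be called as `alignment_depth(open("alignment.maf"))`, where `alignment.maf` is a placeholder file name. For proali output, the two values can be compared against the limits passed via -R and -Q.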

Thanks @baoxingsong

AnchorWave is really different from the whole-genome alignment tools I have used before. I didn't realize this even after reading through the documentation and your papers plus the supplements. Your explanation really helps me better understand how AnchorWave works and how whole-genome alignment differs from read mapping.

Many thanks!