Mapping rate decreasing with time

Question

Opened this issue 2 years ago · 3 comments

Hello,

I have a question about something I noticed.
Initially, wfmash aligns very fast, but then the number of aligned bp per second slowly declines.
This continues until the run is almost done, with the last 10% taking the most time.
I assume this happens because easy-to-align sequences are dispatched and finished quickly, until all the worker threads are occupied by difficult sequences (low-complexity sequence) that take a long time. Is that right?
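To make this hypothesis concrete, here is a minimal sketch of a greedy thread pool. It is a toy model, not wfmash's actual scheduler: the function names, the 4 workers, and the costs (1 time unit for an easy pair, 20 for a hard one) are all made-up assumptions for illustration.

```python
import heapq

def simulate_dispatch(costs, n_workers):
    """Toy model: each task goes to whichever worker becomes free first.

    Returns the sorted finish times of all tasks.
    """
    free_at = [0] * n_workers          # time at which each worker is free
    heapq.heapify(free_at)
    finish_times = []
    for cost in costs:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + cost)
        finish_times.append(start + cost)
    return sorted(finish_times)

def done_by(finish_times, t):
    """Number of tasks finished at or before time t."""
    return sum(1 for f in finish_times if f <= t)

# Hypothetical workload: 20 sequence pairs, 4 of them "hard" (cost 20),
# scattered through the queue; the rest are "easy" (cost 1).
costs = [20 if (i % 4 == 3 and i < 16) else 1 for i in range(20)]
finish = simulate_dispatch(costs, n_workers=4)

for t in (1, 4, 7, 19, finish[-1]):
    print(f"t={t:>2}: {done_by(finish, t)}/{len(finish)} pairs done")
```

With these made-up numbers, 12 of the 20 pairs are done by t=7, nothing else completes until t=20, and the run ends at t=27. Each hard pair pins one worker, so throughput drops step by step even though the easy pairs are evenly spread through the queue.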

Is there a way to make wfmash spend less time on such sequences, trading a rougher alignment for a shorter runtime?

Answer 1 · 2023-06-14T11:06:46.000Z

Hi @cgroza, I am seeing similar problems with species that are harder to align, such as potato and primates. I am working on reducing this problem.

In the meantime, could you share just a few of the sequence pairs that are slowest to align for you? I would like to verify that the high runtime is due to the same causes I've seen in our tests.

Answer 2 · 2023-06-14T11:07:43.000Z

Ah, I am assuming you meant alignment rate and not mapping rate.

Answer 3 · 2023-06-14T19:32:05.000Z

Yes, that's correct.
In my case, these were primate genomes assembled from nanopore reads and polished with short reads.
I suspect the culprits are low-complexity sequences.
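One rough way to check this suspicion on the slow pairs would be to scan them for windows with low k-mer entropy. The sketch below is only an illustration: `kmer_entropy`, `low_complexity_windows`, the window size, `k`, and the threshold are all arbitrary assumptions, and this is not how wfmash itself decides what is hard to align.

```python
import math
from collections import Counter

def kmer_entropy(seq, k=3):
    """Shannon entropy (in bits) of the k-mer distribution of seq."""
    n = len(seq) - k + 1               # number of k-mers in seq
    if n <= 0:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n))
    # sum of p * log2(1/p) over observed k-mers, with p = c / n
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def low_complexity_windows(seq, window=100, k=3, threshold=1.0):
    """Yield (start, entropy) for non-overlapping windows below threshold.

    The default threshold is arbitrary and would need tuning on real data.
    """
    for start in range(0, len(seq) - window + 1, window):
        h = kmer_entropy(seq[start:start + window], k)
        if h < threshold:
            yield start, h

# Toy input: a homopolymer run followed by a more varied stretch.
toy = "A" * 100 + "ACGT" * 25
for start, h in low_complexity_windows(toy):
    print(f"window at {start}: entropy {h:.2f} bits")
```

On the toy input only the homopolymer window is flagged. On real assemblies, comparing the flagged regions against the pairs that take longest to align would show whether low complexity actually lines up with the slow cases.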