lutteropp/NetRAX

Switch to Search in Waves

Opened this issue · 9 comments

In the whiteboard meeting today, we discovered some challenges with the current network topology search algorithm.

  • The arc insertion move neighborhood is way too large. Also, ideally one would allow each proposed arc insertion move to do some extra search with only horizontal moves, before ruling it out via BIC.
  • Instead, if we search in waves and fix the number of reticulations in each (horizontal) wave, we don't even need to directly find the best position for an arc insertion anymore. It will automatically get sorted out by the horizontal rearrangement moves later, avoiding that costly O(n^3) arc insertion neighborhood.

Done in dd6dd2a.

I entirely rewrote it in fa50575. NetRAX is still fighting with failing assertions here and there (some branch lengths issue, likely caused by me trying to switch to only use the branch_lengths array in network inference/ likely broke something there)...

But it already looking like this search strategy finds the network much faster. Also, it only has to start from raxml-ng best tree.

Search in waves rocks! :) Here a few more arguments for it, taken from the Slack channel:

  • It is faster than the naive algo

  • It works pretty well, even when used with just a single start tree

  • It is more likely to accept a reticulation (it gives it a fair chance by re-optimizing topology for the current number of reticulations before deciding. This is, it only checks via BIC after the current wave search finished)

  • We can use whatever likelihood function we want in there. Heck, we could even reintroduce that NEPAL likelihood function, to be used within each wave! And then we can use the (slower to compute) "correct" likelihood definition just whenever we have to decide between networks of different complexity...

I highly advocate for exclusively using the search in waves strategy, at least in the first paper. It has so much optimization potential! I especially like that it allows us to (as a heuristic) reuse the old blob optimization code etc (NEPAL likelihood) within the waves...

So far, search in waves just randomly adds a reticulation to the new network for the next wave (the within-wave horizontal moves are then used to reposition the reticulation). Later (for the second paper), we can try to speed this up by making use of the ideas from the whiteboard meeting (pdf attached) to find positions where a reticulation is likely more beneficial...
Whiteboard Meeting.pdf

Another reason why search in waves might be working so much more nicely right now: Only adding/removing reticulations messes up the pmatrix- and clv-indices (because they change number of nodes/edges). The search in waves could just be better at avoiding some bugs that can happen when not correctly taking care of these index changes...

So waves is the strategy in this drawing ?

Screenshot 2020-12-18 at 09 41 57

If yes, I totally agree that we go for this, and I thought that I had already advocated for it at the beginning of the project (in slack so I am not sure and we will never fund the info, github is super!)

It is also discussed in the "moves" paper with Fabio.

@celinescornavacca Yes! :) NetRAX fully switched to the search in waves now. I have also added an arrow back, to check whether an arc removal in the best found n+1-reticulations network leads to a better n-reticulation-network than what we encountered so far. If this is the case, redo the n+1-reticulation-search from that updated/improved n-reticulation network.