tresoldi/dafsa

Allow optional reordering of node indexes

Opened this issue · 1 comments

Due to minimization, it is common to have "gaps" in the indexes of the nodes, as 5 and 6 in the example below:

>>> print(DAFSA(["mama", "papa", "mapa"]))
DAFSA with 7 nodes and 8 edges (3 inserted sequences)
  +-- #0: 0(#1/3:<m>/2|#7/3:<p>/1) [('m', 1), ('p', 7)]
  +-- #1: n(#2/2:<a>/2) [('a', 2)]
  +-- #2: n(#3/2:<m>/1|#3/2:<p>/1) [('m', 3), ('p', 3)]
  +-- #3: n(#4/3:<a>/3) [('a', 4)]
  +-- #4: F() []
  +-- #7: n(#8/1:<a>/1) [('a', 8)]
  +-- #8: n(#3/1:<p>/1) [('p', 3)]

Indexes are supposed to be meaningless, and reordering after minimization is computationally expensive. Still, it is worth adding a flag that carries the reordering if so specified by the user.

As reordering is an expansive operation for cosmetic purposes, it makes sense to perform some sorting as well, such as placing final nodes at the end of the list, sorting same hierarchy levels alphabetically/by length, etc.