tjunier/newick_utils

Explicit (un)rootedness


Imagine we have a rooted tree (1,(2,3),(4,5));. There doesn't seem to be a way to specify that this is a rooted tree, since the multifurcation at the root is interpreted by the newick utils as an indication of unrootedness.

[The converse case is a tree of the form (1,((2,3),(4,5))); that is to be treated as unrooted.]

It would be very helpful to have an extra argument for the tools: a switch that states whether a tree is to be explicitly treated as rooted or unrooted.

As far as I can see, all programs would treat the tree (1,(2,3),(4,5)); as unrooted, so it is a general problem. Maybe add an additional (fake) outgroup? And I do not know how the tree (1,((2,3),(4,5))); could be interpreted as unrooted (which would be the first tree (1,(2,3),(4,5));).
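To make the fake-outgroup idea concrete, here is a minimal sketch in Python. It is plain string manipulation, not part of the newick utils; the function name `add_fake_outgroup` and the label `FAKE_OUTGROUP` are made up for illustration. It wraps the whole tree under a new bifurcating root, so a parser that guesses rootedness from the root's degree would see two children.

```python
def add_fake_outgroup(newick: str, label: str = "FAKE_OUTGROUP") -> str:
    """Wrap a Newick tree under a new root that has a fake outgroup leaf.

    Hypothetical helper for illustration only; the label is an assumption
    and must not clash with a real taxon name in the tree.
    """
    s = newick.strip()
    if not s.endswith(";"):
        raise ValueError("expected a Newick string terminated by ';'")
    body = s[:-1].rstrip()  # the tree without the trailing semicolon
    # The new root has exactly two children: the fake leaf and the old tree.
    return f"({label},{body});"


print(add_fake_outgroup("(1,(2,3),(4,5));"))
```

The fake leaf would have to be ignored or pruned again after the root-sensitive step, which is why this is only a workaround and not a substitute for an explicit switch.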

According to the format specification, rootedness is not explicitly coded in the newick notation (actually Felsenstein's first example on that page is a rooted tree with a multifurcation at the root). That means that it is impossible to tell whether a tree is rooted based solely on newick. The guessing trick works very well most of the time, but for root-sensitive applications one needs an extra flag for the parser.
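A rough sketch of what the guessing trick plus an override flag could look like, in Python. This is not the actual newick utils code; `root_child_count`, `is_rooted` and the `rooted` parameter are hypothetical names. The sketch assumes labels contain no quoted parentheses or commas and no bracketed comments.

```python
from typing import Optional


def root_child_count(newick: str) -> int:
    """Count the immediate children of the root in a Newick string."""
    s = newick.strip()
    if s.endswith(";"):
        s = s[:-1].rstrip()
    if not s.startswith("("):
        return 0  # a lone leaf has no children
    depth = 0
    children = 1
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            # a comma directly inside the outermost parentheses
            # separates two children of the root
            children += 1
    return children


def is_rooted(newick: str, rooted: Optional[bool] = None) -> bool:
    """Return the caller's explicit choice if given, else guess from the root."""
    if rooted is not None:
        return rooted  # explicit flag overrides the heuristic
    # heuristic: three or more children at the root is read as unrooted
    return root_child_count(newick) <= 2


print(is_rooted("(1,(2,3),(4,5));"))               # guessed from root degree
print(is_rooted("(1,(2,3),(4,5));", rooted=True))  # explicit override
```

With the flag left unset the behaviour stays as it is today; only callers that know better would pass it.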

The tree (1,((2,3),(4,5))); is unrooted if we know it is unrooted or want to treat it as such. [One particularly common scenario is when the program which generated the tree by default outputs trees "rooted" on the first taxon, which is e.g. the case with the phylip package.]

For cases like (1,((2,3),(4,5))); one can just run nw_reroot -d, but there is no such workaround for (1,(2,3),(4,5));-like trees.

As you probably know, in the nexus tree format (built on top of newick) this problem is solved exactly by using a flag (R/U).
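For reference, in a nexus trees block the flag sits in front of the newick string, e.g. `tree t1 = [&R] (1,(2,3),(4,5));`. Below is a small Python sketch of reading it. The function name `parse_nexus_tree_line` is hypothetical, and the regular expression only covers the simple one-line form shown here, not the full nexus grammar.

```python
import re
from typing import Optional, Tuple

# Matches: tree <name> = [&R|&U] <newick>;   (the flag is optional)
_TREE_RE = re.compile(
    r"^\s*tree\s+(\S+)\s*=\s*(?:\[&([RU])\]\s*)?(.*;)\s*$",
    re.IGNORECASE,
)


def parse_nexus_tree_line(line: str) -> Tuple[str, Optional[bool], str]:
    """Split a nexus tree statement into (name, rooted, newick).

    rooted is True for [&R], False for [&U], and None when no flag is given.
    """
    m = _TREE_RE.match(line)
    if m is None:
        raise ValueError("not a simple nexus tree statement")
    name, flag, newick = m.groups()
    rooted = None if flag is None else flag.upper() == "R"
    return name, rooted, newick


print(parse_nexus_tree_line("tree t1 = [&R] (1,(2,3),(4,5));"))
```

A command-line switch for the newick utils would play the same role as this flag for plain newick input.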