evogytis/baltic

commonAncestor returns wrong node when MRCA is root of tree

Opened this issue · 1 comments

If the MRCA of a set of tips given to the commonAncestor function is the root of the tree, then Baltic will sometimes return the node that is the parent of the root instead of the root itself, because both the root and the parent of the root have height 0, so the sorting functionality in the commonAncestor function is insufficient. This causes issue if you want to get some statistics, such as the tMRCA, which is not defined at the parent of the tree root. I think this can be fixed by replacing the current function with the following, which stops the iterating at the root of the tree instead of going all the way to the root parent (there's probably a better way to do this). This function does assume that there is a parent node of the MRCA of all tips in the tree, but I think that is guaranteed by the way Baltic reads in trees.

def commonAncestor(self,descendants,strict=False):
        """
        Find the most recent node object that gave rise to a given list of descendant branches.
        """
        assert len(descendants)>1,'Not enough descendants to find common ancestor: %d'%(len(descendants))
        paths_to_root={k.index: set() for k in descendants} ## for every descendant create an empty set
        for k in descendants: ## iterate through every descendant
            cur_node=k ## start descent from descendant
            while cur_node.parent: ## while not at root
                paths_to_root[k.index].add(cur_node) ## remember every node visited along the way
                cur_node=cur_node.parent ## descend

        return sorted(reduce(set.intersection,paths_to_root.values()),key=lambda k: k.height)[-1] ## return the most recent branch that is shared across all paths to root

Hi Michael,

sorry for taking so long to address this. I've updated the function a while back but forgot to let you know. Your version of the function is missing the use of the strict parameter which I intended to be a shortcut when someone wants the common ancestor of a clade defined by the descendants (i.e. return the common ancestor node if descendant tips form a clade and None otherwise). This is something for me to fix though.