YuLab-SMU/tidytree

as.treedata fails to parse an appropriately-structured tibble

tjcreedy opened this issue · 0 comments

I'm plotting a phylo object with ggtree so that I can integrate a lot of node metadata. To do this I convert the phylo object to a tibble, modify it with various dplyr and purrr functions, then convert it back using as.treedata. When the tibble is modified directly, there are no problems and the tree structure is retained. However, if the tibble is split into a list and recombined, as in the group_split / purrr::modify_at example below, as.treedata fails to parse it correctly even though the tibble still has all the necessary columns: the branch lengths are lost and the node and tip labels are misread.

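library(ape)      # rtree()
library(dplyr)    # left_join(), mutate(), group_split(), bind_rows()
library(purrr)    # modify_at()
library(tidytree) # as_tibble() method for phylo, as.treedata()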

phy <- rtree(20, br = runif)

nodedata <- tibble(node = 21:39, nodeinfo = letters[1:19])
tipdata <- tibble(node = 1:20,  tipinfo = LETTERS[1:20])

phytbl <- as_tibble(phy) %>% 
  left_join(nodedata, by = "node") %>%
  left_join(tipdata, by = "node") %>%
  mutate(label = paste(label, tipinfo))

as.treedata(phytbl)@phylo # Exactly as expected

phytbl <- 
  phytbl %>%
  mutate(branchinfo = as.character(NA)) %>%
  group_split(branch.length > 0.6) %>%
  modify_at(2, ~ mutate(., branchinfo = "info")) %>%
  bind_rows()

as.treedata(phytbl)@phylo # Branch lengths, node and tip labels all lost

This appears to be because phytbl loses the tbl_tree class during the modification; if the class is added back, as.treedata works as expected. However, IMO this isn't readily apparent, and it took me some time and poking into the source code to figure out.
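For anyone hitting the same thing, here is a minimal sketch of the workaround described above: check whether the tbl_tree class survived the split/recombine step and, if not, put it back before calling as.treedata. It assumes ape and dplyr are installed alongside purrr and tidytree, and the set.seed call is only there to make the random tree repeatable. Whether the class is actually dropped may depend on the package versions in use, so the check is conditional.

```r
library(ape)      # rtree()
library(dplyr)    # mutate(), group_split(), bind_rows()
library(purrr)    # modify_at()
library(tidytree) # as_tibble() method for phylo, as.treedata()

set.seed(1)       # only for a repeatable random tree
phy <- rtree(20, br = runif)

# Same split / modify / recombine pattern as in the report
phytbl <- as_tibble(phy) %>%
  mutate(branchinfo = NA_character_) %>%
  group_split(branch.length > 0.6) %>%
  modify_at(2, ~ mutate(., branchinfo = "info")) %>%
  bind_rows()

# Inspect the class vector; in the report, "tbl_tree" is missing at this point
class(phytbl)

# Restore the class only if it was dropped
if (!inherits(phytbl, "tbl_tree")) {
  class(phytbl) <- c("tbl_tree", class(phytbl))
}

tree <- as.treedata(phytbl)
tree@phylo
```

Checking with inherits() first keeps the snippet harmless if the class is retained, since "tbl_tree" is then not added a second time.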