Correctly load newick node annotations containing nested lists
Thanks for the great tool!
We have some trees from BEAST in extended newick format, and each of their nodes has an annotation containing a nested list. Here's a minimal example of a tree containing a single node, with the sort of annotations I'm talking about:
'1:[&rate=1.0,mutations=3.0,history_all={{57,0.08,C,T},{134,0.079,A,G},{4,0.07,C,T}}]1;'
DendroPy loads this tree without errors, but parses the value of the history_all field incorrectly:
>>> import dendropy
>>> t = dendropy.Tree.get(data='1:[&rate=1.0,mutations=3.0,history_all={{57,0.08,C,T},{134,0.079,A,G},{4,0.07,C,T}}]1;', schema='newick')
>>> t.seed_node.annotations.get_value('history_all')
['{57', '0.08', 'C', 'T']
As you can see, only the first sublist is parsed (up to the first closing }), and the first item still contains the opening brace of that sublist.
I know I can pass the Tree.get method the keyword argument extract_comment_metadata=False, and parse the resulting node.comments string myself. That's a nice workaround, but I'm wondering if there's a way I'm not seeing to provide a custom annotation string parser, or if there would be some other easy fix for this behavior?
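For reference, here is a minimal sketch of the kind of manual parsing I mean. It assumes the raw comment string looks like the example above (an optional [ ], a leading &, comma-separated key=value pairs, and { } for lists). The function names are hypothetical helpers, not part of DendroPy, and all values are left as strings:

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside {...}."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_value(text):
    """Turn '{a,{b,c}}' into nested lists; leave scalars as strings."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        inner = text[1:-1]
        if not inner:
            return []
        return [parse_value(part) for part in split_top_level(inner)]
    return text


def parse_annotation_comment(comment):
    """Parse a '&key=value,...' comment string into a dict (hypothetical helper)."""
    body = comment.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    if body.startswith("&"):
        body = body[1:]
    result = {}
    for item in split_top_level(body):
        key, _, value = item.partition("=")
        result[key.strip()] = parse_value(value)
    return result


raw = "&rate=1.0,mutations=3.0,history_all={{57,0.08,C,T},{134,0.079,A,G},{4,0.07,C,T}}"
print(parse_annotation_comment(raw)["history_all"])
# [['57', '0.08', 'C', 'T'], ['134', '0.079', 'A', 'G'], ['4', '0.07', 'C', 'T']]
```

With extract_comment_metadata=False, each string in node.comments could presumably be passed to parse_annotation_comment, but I haven't checked whether those strings keep the leading & or the surrounding brackets, which is why the sketch strips both if present.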
cc @matsen