Export back to NextStrain JSON
I am working on a project where I am trying to add some additional information to the leaves of a Nextstrain JSON tree. Your package is amazing for parsing that format, but I am having a hard time understanding how to export a baltic.baltic.tree object back to JSON format. Is there a function or method for that?
Hi Mike,
This is something that @sidneymbell and I were talking about a few days ago. There's currently nothing in place within the repo itself, but I've written this snippet to do it semi-manually:
import baltic as bt
from datetime import datetime as dt
import json

nexus_tree=bt.loadNexus('/mnt/c/Users/evogytis/Downloads/aln_03_0816.mcc.tree')
nexus_tree.treeStats()

def convertToJSON(node,index,most_recent_tip):
    json_node={'name': None,
               'node_attrs': {'num_date': {'value': node.absoluteTime}},
               'branch_attrs': {}
               }
    if 'height_95%_HPD' in node.traits: ## height 95% HPD available, compute from most recent tip date
        lower,upper=node.traits['height_95%_HPD']
        time_range = [most_recent_tip-upper, most_recent_tip-lower]
        json_node['node_attrs']['num_date']['confidence']=time_range
    if node.branchType=='node': ## node
        json_node['children']=[] ## has children
        json_node['name']='NODE_%07d'%(index) ## different name
        for child in node.children: ## iterate over children
            if child.branchType=='node': index+=1 ## increment index if child is node too
            index,json_child=convertToJSON(child,index,most_recent_tip) ## get the json-formatted child
            json_node['children'].append(json_child) ## attach resulting json-formatted children to current json node
    else:
        json_node['node_attrs']['country']={'value': node.name.split('|')[2], 'confidence': {node.name.split('|')[2]: 1.0}}
        json_node['node_attrs']['location']={'value': node.name.split('|')[3], 'confidence': {node.name.split('|')[3]: 1.0}}
        json_node['name']=node.name.split('|')[0] ## leaf, name is simple
    return index,json_node

def toNextstrainJSON(tree,output):
    _,json_tree=convertToJSON(tree.root,0,tree.mostRecent)
    output_json={'version': 'v2',
                 'meta': {'updated': dt.now().strftime('%Y-%m-%d'),
                          'colorings': [{'key': 'country', 'title': 'Country', 'type': 'categorical'},
                                        {'key': 'location', 'title': 'Location', 'type': 'categorical'}],
                          'panels': ['tree'],
                          'display_defaults': {'color_by': 'country',
                                               'distance_measure': 'num_date',
                                               'geo_resolution': 'country',
                                               'map_triplicate': True}, ## boolean, serialised as JSON true
                          'filters': ['country','location']
                          },
                 'tree': json_tree}
    with open(output,'w') as out_file: ## file is closed automatically, even on error
        json.dump(output_json,out_file,indent=1)

out='/mnt/c/Users/evogytis/Downloads/aln_03_816.json'
toNextstrainJSON(nexus_tree,out)
Hopefully it's clear enough to adapt to your own case, but if not, let me know. I intend to include some way of exporting auspice JSON files in the future, but it's not a priority at the moment.
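For anyone adapting the snippet above, the two dataset-specific steps are the conversion of the 95% HPD node heights into a calendar-date interval and the parsing of metadata out of '|'-delimited tip names. The following is a minimal standalone sketch of just those two steps. The helper names (hpd_heights_to_dates, parse_tip_name) and the example tip name are hypothetical, and the field positions simply mirror the indices used in the snippet (0 for the strain name, 2 for country, 3 for location); adjust them to your own naming scheme.

```python
def hpd_heights_to_dates(most_recent_tip, hpd):
    """Convert a (lower, upper) height interval into a [earliest, latest] date interval.

    Heights are measured backwards in time from the most recent tip, so the
    larger height corresponds to the earlier date.
    """
    lower, upper = hpd
    return [most_recent_tip - upper, most_recent_tip - lower]


def parse_tip_name(name, sep="|"):
    """Split a delimited tip name into the fields used by the snippet above.

    Assumes at least four fields; index 1 is not used here.
    """
    fields = name.split(sep)
    return {"strain": fields[0], "country": fields[2], "location": fields[3]}


# Hypothetical inputs, for illustration only.
date_range = hpd_heights_to_dates(2016.5, (1.0, 2.5))
tip = parse_tip_name("strainA|2016-03-01|Brazil|Recife")
print(date_range)  # [2014.0, 2015.5]
print(tip)         # {'strain': 'strainA', 'country': 'Brazil', 'location': 'Recife'}
```

If your tip names carry fewer fields or use a different separator, parse_tip_name will raise an IndexError or return the wrong fields, so it is worth checking a few names by hand first.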
@evogytis Amazing! Thanks much for this code snippet, and prompt reply.
To give a bit more background: we are trying to visualize mutations for each sample on the tree. We have the back end for the visualization worked out, but need total mutations as a key:value pair for each leaf to provide the data to drive things. I have the tree traversal worked out with the Newick-formatted tree and muts_nt.json. I am trying to find an easy way to make the last step: adding that information back to the Auspice JSON. So, I am going from Nextstrain JSON, modifying each leaf, and then back to Nextstrain JSON. I was having a terrible time parsing the JSON format until I hit upon baltic. This is an amazing set of tools!
So, I think my case might be easier than even the snippet you provided. I should be able to work from your example. I will close for now, and if I hit an issue I will re-open the ticket.
In case anyone comes across this issue and needs to go from Nextstrain JSON back to Nextstrain JSON, here is the code snippet I modified from above.
import baltic as bt
import json

def convertToJSON(node,index,most_recent_tip):
    json_node={'name': None,
               'node_attrs': {'num_date': {'value': node.absoluteTime}}
               }
    if node.branchType=='node': ## node
        json_node['children']=[] ## has children
        json_node['name']=node.name ## keep the original node name
        json_node['node_attrs']=node.traits['node_attrs']
        try:
            json_node['branch_attrs']=node.traits['branch_attrs']
        except KeyError: ## this node has no branch_attrs stored, skip
            pass
        for child in node.children: ## iterate over children
            if child.branchType=='node': index+=1 ## increment index if child is node too
            index,json_child=convertToJSON(child,index,most_recent_tip) ## get the json-formatted child
            json_node['children'].append(json_child) ## attach resulting json-formatted children to current json node
    else:
        json_node['node_attrs']=node.traits['node_attrs']
        json_node['branch_attrs']=node.traits['branch_attrs']
        json_node['name']=node.name
        # make a change to specific leaf node here.
    return index,json_node

def toNextstrainJSON(tree,meta,output):
    _,json_tree=convertToJSON(tree.root,0,tree.mostRecent)
    output_json={'meta': meta,
                 'tree': json_tree,
                 'version': 'v2'}
    with open(output,'w') as out_file: ## file is closed automatically, even on error
        json.dump(output_json,out_file,indent=1)

nextstrainPath='<PATH>/ncov_example.json'
myTree, myMeta = bt.loadJSON(nextstrainPath)
out='auspice_modified_output.json'
toNextstrainJSON(myTree,myMeta,out)
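For the "total mutations per leaf" use case described earlier, the "# make a change to specific leaf node here" placeholder could be filled in along the following lines. This is only a sketch that works on the plain JSON dictionary rather than on a baltic tree, so it runs without baltic. It assumes the v2 layout in which each node lists its nucleotide changes under branch_attrs -> mutations -> nuc, and it assumes that a new attribute needs a matching entry in meta['colorings'] to be selectable in Auspice. The function names, the 'total_mutations' key, and the toy tree are all hypothetical.

```python
def annotate_total_mutations(root, key="total_mutations"):
    """Add a cumulative nucleotide mutation count to every leaf, in place.

    Walks the tree with an explicit stack (no recursion limit issues on deep
    trees) and sums len(branch_attrs['mutations']['nuc']) from root to leaf.
    Returns the number of leaves annotated.
    """
    stack = [(root, 0)]
    n_leaves = 0
    while stack:
        node, inherited = stack.pop()
        nuc = node.get("branch_attrs", {}).get("mutations", {}).get("nuc", [])
        total = inherited + len(nuc)
        children = node.get("children", [])
        if children:
            stack.extend((child, total) for child in children)
        else:  # leaf: store the count in the usual {'value': ...} form
            node.setdefault("node_attrs", {})[key] = {"value": total}
            n_leaves += 1
    return n_leaves


def register_coloring(meta, key="total_mutations", title="Total mutations"):
    """Add a continuous coloring entry for `key` unless one already exists."""
    colorings = meta.setdefault("colorings", [])
    if not any(c.get("key") == key for c in colorings):
        colorings.append({"key": key, "title": title, "type": "continuous"})


# Toy dataset standing in for a real Auspice v2 JSON (illustrative values only).
toy = {
    "version": "v2",
    "meta": {"colorings": []},
    "tree": {
        "name": "NODE_0000000",
        "branch_attrs": {"mutations": {"nuc": ["A10T"]}},
        "children": [
            {"name": "leaf_A", "node_attrs": {},
             "branch_attrs": {"mutations": {"nuc": ["C20G", "G30A"]}}},
            {"name": "leaf_B", "node_attrs": {}, "branch_attrs": {}},
        ],
    },
}

n_annotated = annotate_total_mutations(toy["tree"])
register_coloring(toy["meta"])
print(n_annotated)  # 2
print(toy["tree"]["children"][0]["node_attrs"]["total_mutations"])  # {'value': 3}
```

Note that this counts every change along the path, so a reversion or a repeated hit at the same site is counted more than once. If you need the number of differences relative to the root sequence instead, the counting step would have to track positions rather than list lengths.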