/Tangled

Phylogenetic tree tangle minimization across multiple trees

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

=======
Tangled
=======

About:
------

This program is the result of a Google+ post by Ross Mounce, including an 
image of a tanglegram, in which he asked:

>Q: Does anyone know of a program that can calculate tanglegrams minimizing 
>the tangle between THREE separate trees?

>I'd like to compare 'whole' trees with 'partition1' trees, with 'partition2' 
>trees. W:1:2 At the moment I can only untangle W:1 , W:2 , or 1:2 which
> means I have to untangle the third by hand. I think this is a valid use-case
> - if any devs are reading this - please think about making this happen please ;)

This program is an attempt to make this happen.

Components:
-----------

detangle.py - this is the heart of it; it builds trees and handles 'twisting them', 
i.e. rotating the branches, and uses a simulated annealing or self-tuning 
evolution process to minimize the result of a penalty/measurement function. It
has no dependencies aside from Python 2.7, but can only read NEXUS files where 
the trees are on a single line each. It produces a simple NEXUS file as output,
in which the tangles have been minimized.

detangler.py - this file has Bio.Phylo and detangle.py as dependencies, and will 
open NEXUS, Newick, and PhyloXML files, convert the trees to detangle trees and
hand them off for processing.

It also serves as an example for importing and using detangle.py in your own Bio.Phylo
based scripts.

dendropy_detangler.py - this file has DendroPy and detangle.py as dependencies,
and will open NEXUS, Newick, and PhyloXML (not sure on the last - I was not able to 
determine from the documentation whether it supported PhyloXML or not.)

It also serves as an example for importing and using detangle.py in your own
DendroPy scripts.

tangle_render.py - this file is dependent on detangle.py, Cairo, and RSVG. It uses
detangle to read files, so is limited to the format detangle.py can handle - this
should not be too much of a limitation, since it is intended to be run on the files
output by detangle.py. It reads in a result file, and renders two types of results.
output.svg is a sequence of all the ordered lists of names and their intermediate tangles,
without the tree structure, permitting an easy visual comparision of all the trees. It
attempts to re-use the right tree of a tangle as the left tree of a new tangle,
if there is an available combination left to produce from that tree, to minimize the
number of times a list is drawn. It uses itertools.combinations to produce all the 
possible combinations, ignoring order (i.e. left/right). It also produces a single 
file per combination of trees, named in the form 'tree1:tree2.svg'

Example Command Lines:
----------------------
> python detangle.py test.dat

Process test.dat and produce result.dat (by default, though there are adjustable 
settings in detangle.py itself).

> python detangler.py -o out.dat godef.tre

Process godef.tre and put the output into out.dat instead of result.dat

> python tangle_render.py result.dat

Process result.dat into output.svg and multiple tree1:tree2.svg type files.

> python detangle.py test.dat && python tangle_render.py result.dat && rsvg-view output.svg

Go all the way from input file to viewing a set of tangles (assuming rsvg-view is installed).

Licensing:
----------

Currently the files are licensed under the GPL v3, as specified in the individual files, 
and in the included LICENSE file. Email to the contact below to discuss alternative licensing.

Known Bugs and Issues:
-----------

detangle.py and tangle_render.py are hopelessly limited in their file-reading capabilities.
Hopefully this should not prove a significant obstacle, since other Python libraries can
handle that task. I did not commit to one or the other such library, opting instead to include 
sample wrappers for the libraries I have found. Feel free to point me at more such.

Contact:
--------
Howard C. Shaw III
howardcshaw@gmail.com