License: GNU Lesser General Public License v3.0 (LGPL-3.0)

Phydelity

Inferring putative transmission clusters from phylogenetic trees

Latest updates

  • 25-Jul-2019: V2.1 - Fixed int64 type declaration for cross-platform compatibility; tested on Windows and Mac
  • 10-May-2019: V2.0 - Improved algorithm yielding clusters with higher purity and lower probability of misclassification (see Manuscript for more details).
  • 6-Dec-2018: Fixed a bug that made the clean-up of clusters violating the within-cluster limit overly strict.

Overview

Phydelity, a redesign of PhyCLIP, is a statistically-principled and phylogeny-informed tool capable of identifying putative transmission clusters in pathogen phylogenies without the introduction of arbitrary distance thresholds.

Minimally, Phydelity only requires a phylogeny (in NEWICK format) as input.

Phydelity infers the within-cluster divergence of putative transmission clusters by first determining the pairwise patristic distance distribution of closely-related tips. This distribution comprises the pairwise distances among sequence j and its closest k-neighbouring tips, where the k closest neighbours include sequence j itself. The user can optionally supply the desired k parameter or allow Phydelity to automatically scale k to yield the supremum distribution with the lowest overall divergence.
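The neighbourhood idea described above can be illustrated with a minimal Python sketch. This is not Phydelity's own implementation: the nested-dictionary distance format, the helper names symmetric and k_neighbour_distances, and the choice to count a pair shared by several neighbourhoods only once are all assumptions made for illustration.

```python
from itertools import combinations


def symmetric(pairs):
    """Expand {(a, b): distance} into a symmetric nested dict (hypothetical helper)."""
    tips = sorted({t for pair in pairs for t in pair})
    dist = {t: {t: 0.0} for t in tips}
    for (a, b), d in pairs.items():
        dist[a][b] = d
        dist[b][a] = d
    return dist


def k_neighbour_distances(dist, k):
    """Pool pairwise distances within each tip's closest-k neighbourhood.

    dist: tip -> {tip -> patristic distance}, symmetric, zero on the diagonal.
    k:    neighbourhood size; the neighbourhood of tip j includes j itself.
    """
    pooled = {}
    for j in dist:
        # Rank all tips by distance from j; the second key keeps j first on ties.
        neighbourhood = sorted(dist, key=lambda t: (dist[j][t], t != j))[:k]
        # Assumption of this sketch: each pair is recorded once, even if it
        # appears in more than one neighbourhood.
        for a, b in combinations(sorted(neighbourhood), 2):
            pooled[(a, b)] = dist[a][b]
    return sorted(pooled.values())


# Made-up distances: A, B and C are closely related, D is distant.
pairs = {
    ("A", "B"): 0.01, ("A", "C"): 0.02, ("B", "C"): 0.025,
    ("A", "D"): 0.5, ("B", "D"): 0.505, ("C", "D"): 0.51,
}
dist = symmetric(pairs)
distances = k_neighbour_distances(dist, k=2)
print(distances)
```

With k=2 each tip is paired only with its single nearest neighbour, so the pooled list here holds the A-B, A-C and A-D distances. Raising k widens every neighbourhood and pulls in more, and typically larger, distances.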

To cite Phydelity:
Alvin X Han, Edyth Parker, Sebastian Maurer-Stroh, Colin A Russell, Inferring putative transmission clusters with Phydelity, Virus Evolution, Volume 5, Issue 2, July 2019, vez039, https://doi.org/10.1093/ve/vez039

Quickstart (for users of an academic institution only)

  1. Installation
  • Install dependencies using Anaconda.
    • Phydelity is written in Python 2 (users whose base Conda environment uses Python 3 can build a separate Python 2 environment).
$ conda install -c etetoolkit ete3
$ conda install -c anaconda cython
$ conda install numpy scipy 
$ conda config --add channels http://conda.anaconda.org/gurobi
$ conda install gurobi
$ cd /path/to/Phydelity-master/
$ python setup.py install 
$ python setup.py clean --all 
  2. Run Phydelity
    Minimal input command:
$ phydelity.py --tree </path/to/treefile.newick>

See Full options below for other analysis options.

  3. Outputs
  • cluster_phydelity_k<\d+>_sol<\d+>_.txt - Tab-delimited text file of tip names and corresponding cluster ID.
  • tree_phydelity_k<\d+>_sol<\d+>_.txt - Figtree-formatted NEXUS tree file with cluster annotations.
  • pdftree_phydelity_k<\d+>_sol<\d+>_.txt - Optional PDF tree file if --pdf_tree is called.
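For downstream analysis, the tab-delimited cluster file can be grouped by cluster ID with a few lines of Python. The sketch below is illustrative only: the column order (tip name first, cluster ID second), the presence of a header row, and the header names in the example are assumptions that should be checked against a real output file. The function name read_clusters is hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_clusters(handle, tip_col=0, cluster_col=1, has_header=True):
    """Group tip names by cluster ID from a tab-delimited file handle.

    tip_col, cluster_col and has_header are assumptions about the layout;
    adjust them to match the actual file.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row, if the file has one
    clusters = defaultdict(list)
    for row in reader:
        if len(row) <= max(tip_col, cluster_col):
            continue  # ignore blank or short lines
        clusters[row[cluster_col]].append(row[tip_col])
    return dict(clusters)


# Made-up content standing in for a real cluster output file.
example = io.StringIO("TAXA\tCLUSTER\ntipA\t1\ntipB\t1\ntipC\t2\n")
clusters = read_clusters(example)
print(clusters)
```

To read a real file, pass an open file handle instead of the io.StringIO object.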

Full options

usage: phydelity.py [-h] [-t TREE] [--k K] [--outgroup OUTGROUP]
                    [--collapse_zero_branch_length]
                    [--equivalent_zero_length EQUIVALENT_ZERO_LENGTH]
                    [--solver_verbose {0,1}] [--solver_check] [--pdf_tree]

Phydelity v1.0

optional arguments:
  -h, --help            show this help message and exit

Required:
  -t TREE, --tree TREE  Input phylogenetic tree in NEWICK format.

Analysis options:
  --k K                 Custom k neighbours (optional).
  --outgroup OUTGROUP   Taxon (name as appeared in tree) to be set as outgroup
                        OR type 'midpoint' for midpoint-rooting.
  --collapse_zero_branch_length
                        Collapse internal nodes with zero branch length of
                        tree before running Phydelity.
  --equivalent_zero_length EQUIVALENT_ZERO_LENGTH
                        Maximum branch length to be rounded to zero if the
                        --collapse_zero_branch_length flag is passed (default
                        = 1e-06).

Solver options:
  --solver_verbose {0,1}
                        Gurobi solver verbose (default: 0)
  --solver_check        Check if Gurobi is installed.

Output options:
  --pdf_tree            PDF tree output annotated with cluster results (X
                        server required).