Sequential Tree Sampler for online phylogenetics

Primary language: C++. License: GNU General Public License v3.0 (GPL-3.0).

Sequential Tree Sampler (STS)

The sequential tree sampler implements a prototype of online phylogenetics: it updates a posterior distribution generated by MrBayes with new sequences. The algorithm is described and its performance evaluated in a manuscript, which is also available as a preprint. The scripts used to generate the figures can be found here.

Dependencies

  • smctc - included as git submodule (git submodule update --init)
  • lcfit - included as git submodule (git submodule update --init)
  • Bio++ version 2.2.0 core, seq, and phyl modules. Note that Debian and Ubuntu up to 16.04 ship v2.1.0, which is too old; on these systems, install Bio++ from source using the bpp-setup.sh script. Alternatively, the source code of version 2.3.0 for each module can be downloaded from the releases section on GitHub.
  • cmake
  • gsl version 1 or 2
  • nlopt
  • boost
  • beagle version 2.1 (Optional but recommended)
  • Google Test, packaged as libgtest on Debian/Ubuntu (optional)

Compiling

  1. Install dependencies
  2. Run make

Binaries will be built in _build/release.

Adding taxa to an existing posterior

The tool sts-online adds taxa to an existing posterior tree sample. It takes a FASTA file and a tree file in NEXUS format. The FASTA file must contain an alignment whose taxa are a superset of the taxa in the tree file.
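The superset requirement can be checked before running sts-online. The following is a minimal sketch, not part of STS: it assumes the tree file carries a MrBayes-style translate block mapping indices to taxon labels, and the helper names (fasta_taxa, nexus_translate_taxa, missing_from_alignment) and the tiny inline inputs are hypothetical.

```python
import re


def fasta_taxa(fasta_text):
    """Return the set of sequence names (first word after '>') in FASTA text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def nexus_translate_taxa(nexus_text):
    """Return taxon labels from a 'translate' block.

    Assumption: the block looks like '<index> <label>,' entries ended by ';',
    as MrBayes typically writes it. Quoted labels with spaces are not handled.
    """
    match = re.search(r"translate\s+(.*?);", nexus_text,
                      flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return set()
    taxa = set()
    for entry in match.group(1).split(","):
        parts = entry.split()
        if len(parts) == 2:  # "<index> <label>"
            taxa.add(parts[1])
    return taxa


def missing_from_alignment(fasta_text, nexus_text):
    """Taxa present in the trees but absent from the alignment (should be empty)."""
    return nexus_translate_taxa(nexus_text) - fasta_taxa(fasta_text)


# Hypothetical miniature inputs: the alignment has t1..t3, the trees only t2 and t3.
FASTA = ">t1\nACGT\n>t2\nACGA\n>t3\nACGC\n"
NEXUS = """#NEXUS
begin trees;
   translate
      1 t2,
      2 t3;
   tree gen.1 = (1:0.1,2:0.2);
end;
"""

if __name__ == "__main__":
    print("missing from alignment:", sorted(missing_from_alignment(FASTA, NEXUS)))
    print("taxa to be added:", sorted(fasta_taxa(FASTA) - nexus_translate_taxa(NEXUS)))
```

An empty "missing" set means the alignment covers every taxon in the trees; the taxa found only in the alignment are the ones sts-online would be asked to add.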

Supported DNA substitution models: Jukes-Cantor (JC69), generalised time reversible (GTR), Hasegawa-Kishino-Yano (HKY), Kimura (K80). Supported protein substitution models: JTT, WAG, LG.

Example invocation with JC69

sts-online -b 250 -p 2 --proposal-method lcfit 10taxon-01.fasta 10tax_trim_t1.t 10tax_trim_t1.sts.json

In this example, we use an alignment containing 10 sequences and a posterior sample of trees generated by MrBayes from an alignment that does not contain the sequence labeled t1. sts-online ignores the first 250 trees from 10tax_trim_t1.t (-b 250) and uses a particle factor of 2 (-p 2). The 10tax_trim_t1.sts.json file will contain the updated trees.
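To make the effect of the burn-in and particle factor concrete, here is a small arithmetic sketch. It rests on an assumption this README does not state: that the particle factor multiplies the number of trees left after burn-in to give the particle count. The function name and the figure of 1000 trees in the input file are hypothetical.

```python
def particle_count(n_trees_in_file, burn_in, particle_factor):
    """Number of particles, ASSUMING count = (trees after burn-in) * particle factor."""
    if burn_in >= n_trees_in_file:
        raise ValueError("burn-in discards every tree in the file")
    return (n_trees_in_file - burn_in) * particle_factor


if __name__ == "__main__":
    # Hypothetical input size: a tree file holding 1000 sampled trees.
    print(particle_count(n_trees_in_file=1000, burn_in=250, particle_factor=2))
```

Under that assumption, raising the particle factor increases the number of particles, and therefore the run time, roughly in proportion.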

Example invocation with GTR

sts-online -b 250 -p 2 --proposal-method lcfit 10taxon-01.fasta 10tax_trim_t1.t 10tax_trim_t1.sts.json -P 10tax_trim_t1.p -M GTR -o 10tax

In this example, we again use an alignment containing 10 sequences and a posterior sample of trees generated by MrBayes under the GTR model from an alignment that does not contain the sequence labeled t1. sts-online discards the first 250 trees from 10tax_trim_t1.t, together with their parameters from 10tax_trim_t1.p, and uses a particle factor of 2. The 10tax_trim_t1.sts.json file will contain the updated trees and parameters in JSON format. With the -o option, two additional files are also created: 10tax.log, containing the parameters (CSV file), and 10tax.trees, containing the trees (NEXUS file).
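When running many such updates, it can help to assemble the command line programmatically. The sketch below is a hypothetical wrapper, not something shipped with STS; it uses only the options that appear in the two invocations above (-b, -p, --proposal-method, -P, -M, -o) and reproduces the GTR command verbatim.

```python
import shlex


def sts_online_command(alignment, trees, output_json, burn_in=None,
                       particle_factor=None, proposal_method=None,
                       params=None, model=None, output_prefix=None):
    """Build an argument list for sts-online from the options shown in this README."""
    cmd = ["sts-online"]
    if burn_in is not None:
        cmd += ["-b", str(burn_in)]
    if particle_factor is not None:
        cmd += ["-p", str(particle_factor)]
    if proposal_method is not None:
        cmd += ["--proposal-method", proposal_method]
    # Positional arguments: alignment, input trees, output JSON.
    cmd += [alignment, trees, output_json]
    if params is not None:
        cmd += ["-P", params]
    if model is not None:
        cmd += ["-M", model]
    if output_prefix is not None:
        cmd += ["-o", output_prefix]
    return cmd


if __name__ == "__main__":
    cmd = sts_online_command(
        "10taxon-01.fasta", "10tax_trim_t1.t", "10tax_trim_t1.sts.json",
        burn_in=250, particle_factor=2, proposal_method="lcfit",
        params="10tax_trim_t1.p", model="GTR", output_prefix="10tax",
    )
    print(shlex.join(cmd))
    # To actually run it (requires sts-online on PATH):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if file names contain spaces.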