Phylo2Vec: A vector representation of binary trees

This repository contains an implementation of Phylo2Vec which includes:

  • cfg/: Example configuration files
  • data/: Placeholder folder to contain sequence files in FASTA format.
  • examples/: Example notebooks for different datasets
  • hc/: Phylogenetic tree optimisation via hill climbing
    • Branch length and nucleotide substitution model optimisation relies on RAxML-NG
  • tests/: Placeholder folder for unit tests
  • trees/: Placeholder folder to contain tree files as Newick strings.
  • utils/: Utility functions including definitions of Phylo2Vec and transforms from commonly used tree formats to Phylo2Vec (and vice versa).
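
To give a feel for the representation handled in utils/, here is a minimal sketch. It assumes the constraint used in the Phylo2Vec definition that a tree with n leaves is encoded as an integer vector v of length n - 1 with v[i] in {0, ..., 2i}. The helper names below are hypothetical and are not taken from this repository's utils/ module.

```python
import random


def sample_vector(n_leaves, rng=random):
    """Sample a random vector of length n_leaves - 1 with v[i] in {0, ..., 2i}.

    Assumption: this is the validity constraint of Phylo2Vec vectors.
    """
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


def is_valid(v):
    """Check the assumed constraint 0 <= v[i] <= 2i for every position i."""
    return all(isinstance(x, int) and 0 <= x <= 2 * i for i, x in enumerate(v))


def count_vectors(n_leaves):
    """Number of vectors satisfying the constraint: 1 * 3 * 5 * ... * (2n - 3)."""
    total = 1
    for i in range(n_leaves - 1):
        total *= 2 * i + 1
    return total


if __name__ == "__main__":
    v = sample_vector(6)
    print(v, is_valid(v))
    print(count_vectors(4))
```

Under this constraint, the number of valid vectors for n leaves equals (2n - 3)!!, which matches the number of rooted binary trees on n labelled leaves. Converting a vector to or from a Newick string is left to the repository's own utilities.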

Demo

A quick demo of hill-climbing optimisation with Phylo2Vec is available in the demo.ipynb notebook.

A more minimal demo with an updated definition of Phylo2Vec is available on Colab.

Environment setup

To reproduce the environment, run:

conda env create -f env.yml

Run hill climbing-based optimisation using Phylo2Vec

To run hill climbing-based optimisation using Phylo2Vec, run:

conda activate phylo

python -m hc.main
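
For intuition only, the general idea of hill climbing over such vectors can be sketched as follows. This is not the code in hc/: the function names, the pass limit, and the toy loss are all assumptions made for illustration. In the repository, tree scoring relies on RAxML-NG rather than a toy function. The sketch assumes each entry v[i] may take any value in {0, ..., 2i}.

```python
import random


def hill_climb(v, loss, max_passes=10, seed=0):
    """Greedy coordinate-wise hill climbing (illustrative sketch).

    Tries every allowed value for one entry at a time and keeps a change
    only if it strictly lowers the loss. Stops when a full pass over the
    vector yields no improvement or after max_passes passes.
    """
    rng = random.Random(seed)
    best = list(v)
    best_loss = loss(best)
    for _ in range(max_passes):
        improved = False
        order = list(range(1, len(best)))  # v[0] can only be 0, so skip it
        rng.shuffle(order)
        for i in order:
            for candidate in range(2 * i + 1):  # assumed range {0, ..., 2i}
                if candidate == best[i]:
                    continue
                trial = best[:i] + [candidate] + best[i + 1:]
                trial_loss = loss(trial)
                if trial_loss < best_loss:
                    best, best_loss = trial, trial_loss
                    improved = True
        if not improved:
            break
    return best, best_loss


def toy_loss(v, target=(0, 2, 1, 5)):
    """Hypothetical stand-in objective: number of entries differing from target."""
    return sum(a != b for a, b in zip(v, target))


if __name__ == "__main__":
    print(hill_climb([0, 0, 0, 0], toy_loss))
```

In a real run, the loss would be replaced by a phylogenetic criterion such as a negative log-likelihood, where each evaluation is far more expensive than in this toy example.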

Other third-party software

Accessing data

The following datasets were used:

Future work

As mentioned in the submission, we plan to add more optimisation schemes using Phylo2Vec, e.g., MCTS or gradient descent.

See https://github.com/Neclow/gradme