Official Implementation of QT-GILD

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

QT-GILD (version 1.0)

This repository contains the official implementation of "QT-GILD: Quartet Based Gene Tree Imputation Using Deep Learning Improves Phylogenomic Analyses Despite Missing Data". If you use any part of this software, please cite our paper.

Short Description

QT-GILD is a quartet imputation technique for estimating species trees despite the presence of missing data.

QT-GILD is an automated, specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing (NLP). It learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly.

  • Input: A set of incomplete gene trees
  • Output: The imputed quartet distribution of the gene trees
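To make the terms concrete, the following is a minimal sketch of what a quartet distribution is and why missing taxa leave gaps in it. It is not QT-GILD's imputation method and does not use the repository's code. The nested-tuple tree encoding and the function names (clades, induced_quartets, quartet_distribution) are assumptions made for illustration only. Each gene tree induces one topology ab|cd for every group of four taxa it contains, and the distribution counts how many gene trees support each topology.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf set of tree, leaf sets of all internal subtrees).

    A tree is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D")).
    This encoding is a hypothetical stand-in for a parsed Newick tree.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def induced_quartets(tree):
    """List the resolved quartet topologies induced by one gene tree."""
    leaves, all_clades = clades(tree)
    quartets = []
    for four in combinations(sorted(leaves), 4):
        four_set = frozenset(four)
        for clade in all_clades:
            inside = four_set & clade
            # A clade holding exactly two of the four taxa separates
            # them from the other two, which gives the topology ab|cd.
            if len(inside) == 2:
                pair1 = tuple(sorted(inside))
                pair2 = tuple(sorted(four_set - inside))
                quartets.append(tuple(sorted([pair1, pair2])))
                break
    return quartets


def quartet_distribution(trees):
    """Count how many gene trees induce each quartet topology."""
    counts = Counter()
    for tree in trees:
        counts.update(induced_quartets(tree))
    return counts


gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "B"), "C"),  # taxon D is missing, so no quartet is induced
]
distribution = quartet_distribution(gene_trees)
for (left, right), weight in sorted(distribution.items()):
    print("".join(left) + "|" + "".join(right), weight)
```

The last gene tree lacks taxon D, so it contributes nothing to the counts. Gaps of this kind are what an imputation step is meant to fill.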

Installing QT-GILD

Before installing QT-GILD, please make sure that you have the following programs installed:

  • Python: Version >= 3.7
  • Pip: Version >= 21.0
  • Java: Version >= 11.0 (if you want to generate the species trees using wQFM)
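The prerequisites above can be checked with a short script such as the sketch below. It is not part of QT-GILD. The helper names (meets_minimum, report) are hypothetical, and the script only checks that a java executable is on the PATH rather than parsing its version.

```python
import re
import shutil
import sys


def meets_minimum(version, minimum):
    """Compare a version string such as '21.0.1' with a tuple such as (21, 0)."""
    numbers = tuple(int(part) for part in re.findall(r"\d+", version))
    return numbers[:len(minimum)] >= tuple(minimum)


def report():
    """Print whether each prerequisite listed above appears to be satisfied."""
    python_version = "%d.%d" % sys.version_info[:2]
    print("Python >= 3.7:", meets_minimum(python_version, (3, 7)))
    try:
        import pip  # pip exposes its version as pip.__version__
        print("Pip >= 21.0:", meets_minimum(pip.__version__, (21, 0)))
    except ImportError:
        print("Pip >= 21.0: pip is not importable in this environment")
    # Java is only needed when species trees are generated with wQFM.
    print("Java found on PATH:", shutil.which("java") is not None)


if __name__ == "__main__":
    report()
```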

To install the required Python packages, use the following command:

pip install -r requirements.txt

The authors recommend installing Anaconda and using a separate conda environment to install QT-GILD.

If you use wQFM, please cite the paper "wQFM: Highly Accurate Genome-scale Species Tree Estimation from Weighted Quartets".

Usage

To impute and generate the imputed weighted quartet distribution, use the -i and -o flags:

python QT-GILD.py -i <input-gene-tree-file> -o <output-folder>

OR

python QT-GILD.py --input <input-gene-tree-file> --output <output-folder>

To also generate the species tree using wQFM, add the --st flag to the usual input and output flags:

python QT-GILD.py -i <input-gene-tree-file> -o <output-folder> --st
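When QT-GILD is run over many input files, the same commands can be assembled from a script. The sketch below uses only the flags documented above (-i, -o, --st). The helper name build_command is hypothetical and not part of QT-GILD, and the sketch assumes it is run from the repository root, where QT-GILD.py is located.

```python
import subprocess


def build_command(input_file, output_folder, species_tree=False):
    """Assemble the QT-GILD command line from the documented flags."""
    command = ["python", "QT-GILD.py", "-i", input_file, "-o", output_folder]
    if species_tree:
        # --st additionally estimates the species tree with wQFM (needs Java).
        command.append("--st")
    return command


def run_qt_gild(input_file, output_folder, species_tree=False):
    """Run QT-GILD and raise CalledProcessError if it exits with an error."""
    command = build_command(input_file, output_folder, species_tree)
    subprocess.run(command, check=True)


print(" ".join(build_command("test/aminota_gt.tre", "output", species_tree=True)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.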

Example

The repository provides two gene tree files for testing QT-GILD. For example:

python QT-GILD.py --input test/aminota_gt.tre --output output