orthofisher

a broadly applicable tool for automated gene identification and retrieval

Primary language: Python · License: MIT





Orthofisher conducts automated, high-throughput identification of a predetermined set of orthologs, which can be used for phylogenomics, gene family copy number determination, and more!

If you found orthofisher useful, please cite: orthofisher: a broadly applicable tool for automated gene identification and retrieval. Steenwyk & Rokas 2021, G3 Genes|Genomes|Genetics. doi: 10.1093/g3journal/jkab250.



Guide

Quick Start
Performance Assessment
FAQ




Quick Start

For detailed instructions on usage and a tutorial, please see the online documentation.


1) Prerequisite

Before installing orthofisher, please install HMMER3 and add the directory containing the HMMER binaries to your PATH in your .bashrc. For example, my .bashrc has the following:

export PATH=$PATH:/home/steenwj/SOFTWARE/hmmer-3.1b2-linux-intel-x86_64/binaries
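After editing your .bashrc and opening a new shell, it can help to confirm that the HMMER binaries are actually discoverable. The following is a minimal sketch, not part of orthofisher itself. It assumes that hmmsearch and hmmbuild (two programs shipped with HMMER3) are the binaries of interest; the helper name find_hmmer_tools is hypothetical.

```python
import shutil


def find_hmmer_tools(tools=("hmmsearch", "hmmbuild")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The default tool names are an assumption: they are programs shipped
    with HMMER3, chosen here only for illustration.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_hmmer_tools().items():
        # shutil.which returns None when the executable cannot be found
        print(f"{tool}: {path if path else 'NOT FOUND on PATH'}")
```

If any tool is reported as not found, re-check the directory in the export line above and reload your shell configuration.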

2) Install orthofisher

If you are having trouble installing orthofisher, please contact the lead developer, Jacob L. Steenwyk, via email or Twitter for help.


To install via Anaconda, execute the following command:

conda install -c jlsteenwyk orthofisher

Visit here for more information: https://anaconda.org/jlsteenwyk/orthofisher


To install via pip, execute the following command:

pip install orthofisher

To install from source, execute the following commands:

# download
git clone https://github.com/JLSteenwyk/orthofisher.git
# change dir
cd orthofisher/
# install
make install

If you run into software dependency issues, install orthofisher in a virtual environment. To do so, create and activate your virtual environment with the following commands:

# create virtual environment
python -m venv .venv
# activate virtual environment
source .venv/bin/activate

Next, install the software using your preferred method above. Thereafter, you will be able to use orthofisher.

To deactivate your virtual environment, use the following command:

# deactivate virtual environment
deactivate

Note that the virtual environment must be activated whenever you use orthofisher.
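If you are unsure whether the virtual environment is currently active, one common check is to compare Python's sys.prefix with sys.base_prefix, which differ inside an environment created with python -m venv. This is a generic sketch, not an orthofisher feature, and the function name in_virtual_environment is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv.

    Inside an environment created by `python -m venv`, sys.prefix points
    at the environment while sys.base_prefix points at the base install.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run 'source .venv/bin/activate' first.")
```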




Performance Assessment

Using 1,530 sequence similarity searches across six model eukaryotic proteomes, the performance of orthofisher was compared to results obtained from BUSCO. Examination of precision and recall revealed near-perfect performance: orthofisher had a recall of 1.0 and a precision of 0.99. Precision is less than 1.0 because BUSCO applies priors of expected sequence length and sequence similarity scores, which are not implemented in orthofisher; as a result, the BUSCO pipeline reported more missing genes than the orthofisher pipeline.
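For readers less familiar with these metrics, the sketch below shows how precision and recall are computed from counts of true positives, false positives, and false negatives. The counts used in the example are hypothetical and chosen only to reproduce values of the same magnitude; they are not the counts from the orthofisher evaluation.

```python
def precision(true_pos, false_pos):
    """Fraction of reported hits that agree with the reference."""
    reported = true_pos + false_pos
    return true_pos / reported if reported else 0.0


def recall(true_pos, false_neg):
    """Fraction of reference hits that were recovered."""
    expected = true_pos + false_neg
    return true_pos / expected if expected else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper):
    # 99 hits agree with the reference, 1 extra hit is absent from the
    # reference, and no reference hits are missed.
    tp, fp, fn = 99, 1, 0
    print(f"precision = {precision(tp, fp):.2f}")
    print(f"recall    = {recall(tp, fn):.2f}")
```

In a comparison like this one, a gene found by orthofisher but reported as missing by the reference pipeline counts as a false positive, which lowers precision while leaving recall untouched.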




FAQ


I am having trouble installing orthofisher, what should I do?

Please install orthofisher in a virtual environment as described in the installation instructions. If you still run into issues after installing in a virtual environment, please contact Jacob L. Steenwyk via email or Twitter.




orthofisher is developed and maintained by Jacob L. Steenwyk.