Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).

OrthoLang: short, reproducible phylogenomic cuts

OrthoLang is a scripting language meant to simplify a common task in bioinformatics: making a list of candidate genes or genomes related to a biological process of interest. These are sometimes called phylogenomic cuts.

It has two main design goals:

  1. Be useful for biologists with limited prior coding experience
  2. Codify the search in a format that can be published and reused by others

Run install.sh to install it on Mac or Linux.

See the demo site for a more detailed overview, tutorial, interactive examples, and reference of available functions.

Development Status

If you just want to try the current release, only two test statuses matter: the master branch and the demo site.

But you may also be interested in progress on an upcoming feature.

In the tables below, a green checked box means the item is done, a grey one means it is in progress, and a blank one means it is pending. A test badge means the Travis tests are passing on both Mac and Linux VMs.

These are changes to the core code or build system:

branch code tests demo docs
feature-cachix ✔️ feature-cachix
feature-logging feature-logging
feature-progressbar ✔️ feature-progressbar
feature-rerun-tests feature-rerun-tests
feature-singularity feature-singularity

And these are "modules" related to a specific language feature or bioinformatics program:

branch code tests demo docs
module-allvsall ✔️ module-allvsall
module-biomartr ✔️ module-biomartr
module-blast ✔️ module-blast ✔️
module-blastdb ✔️ module-blastdb ✔️
module-blasthits ✔️ module-blasthits ✔️
module-blastrbh ✔️ module-blastrbh
module-busco ✔️ module-busco ✔️
module-cheat module-cheat
module-crbblast ✔️ module-crbblast ✔️
module-diamond ✔️ module-diamond ✔️
module-greencut module-greencut
module-hmmer ✔️ module-hmmer ✔️
module-listlike module-listlike
module-load ✔️ module-load ✔️
module-math ✔️ module-math ✔️
module-mmseqs ✔️ module-mmseqs ✔️
module-muscle ✔️ module-muscle ✔️
module-orthofinder module-orthofinder ✔️
module-orthogroups module-orthogroups
module-permute module-permute
module-plots module-plots
module-psiblast module-psiblast ✔️
module-range ✔️ module-range
module-sample ✔️ module-sample
module-scores ✔️ module-scores
module-seqio ✔️ module-seqio ✔️
module-sets ✔️ module-sets
module-setstable ✔️ module-setstable
module-sonicparanoid ✔️ module-sonicparanoid
module-summarize ✔️ module-summarize

Build OrthoLang and run self-tests

OrthoLang is best built using Nix, which ensures that all dependencies are exactly satisfied. It needs little manual work, but it will download or build a lot of packages and store them in /nix.

First you need the package manager itself. See the website for instructions, or just run this:

curl -L https://nixos.org/nix/install | sh
source ~/.nix-profile/etc/profile.d/nix.sh

After you have Nix, clone this repository and run nix-build -j$(nproc) inside it. It will eventually create a symlink called result that points to the finished package.
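
Note that nproc comes from GNU coreutils and may be missing on a stock macOS install. The following is a minimal sketch of a portable job count, assuming a POSIX-style shell. The helper name cpu_count is hypothetical and not part of OrthoLang or Nix.

```shell
# cpu_count: hypothetical helper, not part of OrthoLang or Nix.
# Prints the number of CPU cores, trying nproc (GNU coreutils) first,
# then sysctl (macOS/BSD), and falling back to 1 if neither works.
cpu_count() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  elif command -v sysctl >/dev/null 2>&1 && sysctl -n hw.ncpu >/dev/null 2>&1; then
    sysctl -n hw.ncpu
  else
    echo 1
  fi
}

# Usage (run inside the cloned repository):
#   nix-build -j"$(cpu_count)"
```

Falling back to a single job keeps the build working, only more slowly, on systems where neither tool is available.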

Before using it, run the test suite to check that everything works:

./result/bin/ortholang --test

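You might also want to add it to your PATH so you can call ortholang from anywhere. Add a line like this to your ~/.bashrc, replacing /path/to/ortholang with the absolute path of your clone ($PWD would expand to whatever directory each new shell starts in, not the repository):

export PATH="/path/to/ortholang/result/bin:$PATH"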
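
Because ~/.bashrc is read by every new interactive shell, a plain export can add the same entry again in nested shells. A minimal sketch of an idempotent alternative, assuming bash or another POSIX-style shell. The helper name path_prepend and the location /path/to/ortholang are placeholders, not part of OrthoLang.

```shell
# path_prepend: hypothetical helper, not part of OrthoLang.
# Prepends a directory to PATH only if it is not already listed,
# so re-sourcing ~/.bashrc does not keep growing PATH.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already present: leave PATH unchanged
    *) PATH="$1:$PATH" ;;      # otherwise put the new directory first
  esac
  export PATH
}

# Usage in ~/.bashrc (replace the placeholder with your clone's absolute path):
#   path_prepend /path/to/ortholang/result/bin
```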

Docker

Get the latest official image from Docker Hub like so:

docker pull jefdaj/ortholang

To build a new image, edit and run dev-scripts/build-docker-image.sh.

Singularity

Running nix-build singularity.nix should get you most of the way there, but first edit that file to include any bind directories and mount points used by your institution's HPC environment.

The resulting .img file can be run with a long command like this:

singularity run -B /path/to/your/mount/point:/path/to/your/mount/point ortholang.img

That will drop you in a shell with ortholang and all its dependencies available. You'll only be able to use the host filesystem through the specified bind points. Note that your institution might bind some paths automatically; you don't need -B flags for those.
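
With several mount points the command gets repetitive. A minimal sketch that generates the -B pairs from a list of directories, assuming each directory is bound to the same path inside the container and that no path contains spaces. The helper name bind_flags and the example paths are hypothetical.

```shell
# bind_flags: hypothetical helper, not part of OrthoLang or Singularity.
# Prints "-B dir:dir" for each argument on a single line, binding every
# host directory to the same path inside the container.
# Assumes paths contain no whitespace, since the result is word-split.
bind_flags() {
  flags=""
  for dir in "$@"; do
    flags="$flags -B $dir:$dir"
  done
  # Strip the leading space before printing
  printf '%s\n' "${flags# }"
}

# Usage (paths are placeholders for your site's mount points):
#   singularity run $(bind_flags /scratch /projects/mylab) ortholang.img
```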

If you're using this, you may also want to write a custom wrapper script that tells OrthoLang how to run system calls using your HPC scheduler (SLURM or similar).
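
As a rough illustration only, such a wrapper might forward each command to the scheduler. How OrthoLang actually hands commands to a wrapper is not described here, so the calling convention below (the command and its arguments passed as the wrapper's own arguments) is an assumption, as are the names run_with_scheduler and SCHEDULER_CMD. srun is SLURM's job launcher, and the resource flag shown is just an example.

```shell
# run_with_scheduler: hypothetical wrapper, not part of OrthoLang.
# ASSUMPTION: the command to run arrives as this function's arguments.
# SCHEDULER_CMD defaults to SLURM's srun with one CPU per task; set it
# to something else to target a different scheduler or to test locally.
run_with_scheduler() {
  # Left unquoted on purpose so the scheduler command splits into words
  ${SCHEDULER_CMD:-srun --cpus-per-task=1} "$@"
}

# Usage (the wrapped command is only an example):
#   run_with_scheduler blastp -query genes.faa -db proteomes
```

Keeping the scheduler command in a variable makes it possible to try the wrapper on a machine without SLURM, for example by setting SCHEDULER_CMD=echo to print what would be run.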

Try it out

These commands will run an existing script, load an existing script in the interpreter, and start a new script in the interpreter respectively:

  • ortholang --script your-existing-script.ol
  • ortholang --script your-existing-script.ol --interactive
  • ortholang

See usage.txt for other command line options, and type :help in the interpreter for a list of special : commands (things you can only do in the live interpreter).
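
To run a whole directory of scripts non-interactively, the --script form can be put in a loop. A minimal sketch, assuming ortholang is on your PATH and that it exits with a non-zero status when a script fails (an assumption, not something stated here). The helper name run_all_scripts and the ORTHOLANG override variable are hypothetical.

```shell
# run_all_scripts: hypothetical helper, not part of OrthoLang.
# Runs every .ol script in the given directory, one after another.
# ORTHOLANG can point at a specific binary, e.g. ./result/bin/ortholang.
# ASSUMPTION: a non-zero exit status from ortholang signals failure.
run_all_scripts() {
  dir=$1
  for script in "$dir"/*.ol; do
    [ -e "$script" ] || continue   # no .ol files: skip the unexpanded pattern
    "${ORTHOLANG:-ortholang}" --script "$script" || echo "failed: $script" >&2
  done
}

# Usage:
#   run_all_scripts ~/my-cuts
```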

Now you're ready to start writing your own scripts! See the demo site for everything related to that.