🍄 Qiime2 ITS classifiers for the UNITE database

Primary language: HTML. License: BSD 3-Clause "New" or "Revised" (BSD-3-Clause).

unite-train

A pipeline to build Qiime2 taxonomy classifiers for the UNITE database.


Running Snakemake workflow

Set up:

  • Install Mambaforge and configure Bioconda.
  • Install the version of Qiime2 you want, using the recommended environment name. (For a faster install, you can replace conda with mamba.)
  • Install Snakemake into an environment, then activate that environment.
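
Before launching the workflow, it can help to confirm that the tools used in this README are on your PATH. The following is a minimal sketch, not part of the pipeline: `missing_tools` is a hypothetical helper name, and the tool list (conda, snakemake, and dot from Graphviz, which the DAG report command pipes into) is an assumption based on the commands shown in this README.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool list is an assumption drawn from the commands used in this README.
missing=$(missing_tools conda snakemake dot)

if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Missing from PATH:" $missing
else
  echo "All required tools found."
fi
```

If anything is reported missing, activate the Snakemake environment (or install Graphviz) and run the check again.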

Configure:

  • Open up config/config.yaml and configure it to your liking. (For example, you may need to update the name of your Qiime2 environment.)

Run:

snakemake --cores 8 --use-conda --resources mem_mb=10000

Training one classifier takes 1-9 hours on an AMD EPYC 75F3 (Milan) CPU, depending on the size and complexity of the data.
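
To get a rough feel for total wall time, the 1-9 hour figure can be turned into a crude lower and upper bound. This is only a back-of-the-envelope sketch: `estimate_hours` is a hypothetical helper, the classifier count and the number trained in parallel are example values rather than anything defined by this pipeline, and it assumes classifiers train in equal-sized waves with no scheduling overhead.

```shell
#!/bin/sh
# Hypothetical helper: estimate_hours <n_classifiers> <n_parallel>
# Prints "<low> <high>" total hours, using the 1-9 hours per classifier
# figure quoted above and assuming training proceeds in full waves.
estimate_hours() {
  n=$1
  p=$2
  waves=$(( (n + p - 1) / p ))   # ceiling of n / p
  echo "$(( waves * 1 )) $(( waves * 9 ))"
}

# Example values only: 10 classifiers, 4 trained at a time.
set -- $(estimate_hours 10 4)
echo "Estimated wall time: between $1 and $2 hours"
```

Real runs will land somewhere inside this range, since most classifiers will not hit the 9-hour worst case.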

Run on a Slurm cluster:

More specifically, the University of Florida HiPerGator supercomputer, with access generously provided by the Kawahara Lab!

screen    # We connect to a random login node, so we may not be able...
screen -r # to reconnect with this later on.
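
Because the login node is assigned at random, one way to find the screen session again is to record the node's name when you start it. This is a sketch under assumptions: the file name is a hypothetical choice, and it presumes that your home directory is shared across login nodes and that the cluster lets you ssh from one login node to another.

```shell
#!/bin/sh
# Hypothetical file for remembering which login node hosts the screen session.
node_file="${NODE_FILE:-$HOME/.unite-train-login-node}"

# Run this on the login node, right before starting screen.
uname -n > "$node_file"
echo "screen session lives on: $(cat "$node_file")"

# Later, from any login node (assuming ssh between login nodes is allowed):
#   ssh "$(cat "$node_file")"
#   screen -r
```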

snakemake --jobs 24 --slurm \
  --rerun-incomplete --retries 3 \
  --use-envmodules --latency-wait 10 \
  --default-resources slurm_account=kawahara slurm_partition=hpg-milan

Run with Docker:

Say, in 'the cloud' using FlowDeploy. (Snakemake's --use-singularity flag can pull and run Docker images.)

snakemake --jobs 12 \
  --rerun-incomplete --retries 3 \
  --use-singularity \
  --default-resources

Reports:

snakemake --report results/report.html
snakemake --forceall --dag --dryrun | dot -Tpdf > results/dag.pdf