Match your input taxonomy against the taxonomy of another database; currently GBIF and the Dutch Species Register (NSR) are supported.
Clone this repo in your Galaxy Tools directory:
git clone https://github.com/naturalis/galaxy-tool-taxonmatcher
Make the scripts executable:
chmod 755 galaxy-tool-taxonmatcher/taxonmatcher.sh
chmod 755 galaxy-tool-taxonmatcher/taxonmatcher.py
Append the following line to the file tool_conf.xml:
<tool file="/path/to/Tools/galaxy-tool-taxonmatcher/taxonmatcher.xml" />
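If you prefer to script this step, the following is a minimal sketch that appends a tool entry to tool_conf.xml only when it is not already listed. The function name add_tool_entry is hypothetical and not part of this repository, and the sketch assumes the entry can sit directly under the root element of the file.

```python
import xml.etree.ElementTree as ET


def add_tool_entry(tool_conf_path, tool_file):
    """Append <tool file="..."/> to the root element unless it is already listed.

    Returns True if an entry was added, False if it was already present.
    """
    tree = ET.parse(tool_conf_path)
    root = tree.getroot()
    # Look at every <tool> element, including those nested in sections.
    for tool in root.iter("tool"):
        if tool.get("file") == tool_file:
            return False
    ET.SubElement(root, "tool", {"file": tool_file})
    tree.write(tool_conf_path, encoding="utf-8", xml_declaration=True)
    return True
```

Call it with the path to your tool_conf.xml and the same tool path you would otherwise paste by hand. ElementTree discards XML comments when it rewrites a file, so keep a backup of tool_conf.xml if it contains comments you want to preserve.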
Depending on your setup, the ansible.builtin.git module could be used instead of cloning the repository by hand.
Install the tool by including the following in your dedicated *.yml file:
- repo: https://github.com/naturalis/galaxy-tool-taxonmatcher
  file: taxonmatcher.xml
  version: master
The instructions above assume that python3 and the gcc compiler are installed.
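A quick way to confirm these prerequisites is to look the commands up on PATH. The sketch below is only an illustration; the function name missing_prerequisites is hypothetical, and it checks that the commands can be found, not which versions they are.

```python
import shutil


def missing_prerequisites(commands=("python3", "gcc")):
    """Return the commands from `commands` that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]
```

An empty list from missing_prerequisites() means both commands were found.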
The steps below should be executed from the galaxy-tool-taxonmatcher folder
in your Galaxy Tools directory.
Download the GBIF taxonomy backbone
wget https://hosted-datasets.gbif.org/datasets/backbone/current/backbone.zip
Extract Taxon.tsv from the archive
unzip -p backbone.zip Taxon.tsv > Taxon.tsv
Taxon.tsv should be in path/to/galaxy-tool-taxonmatcher/
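The extraction step can also be done from Python, which may be convenient where the unzip command is not available. The sketch below mirrors `unzip -p backbone.zip Taxon.tsv > Taxon.tsv` using the standard zipfile module; the function name extract_member is hypothetical and not part of this repository.

```python
import shutil
import zipfile


def extract_member(archive_path, member, destination):
    """Stream a single file out of a zip archive and write it to `destination`."""
    with zipfile.ZipFile(archive_path) as archive:
        # Copy in chunks so a large table is never held in memory at once.
        with archive.open(member) as src, open(destination, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return destination


# Example call, run from the galaxy-tool-taxonmatcher folder:
# extract_member("backbone.zip", "Taxon.tsv", "Taxon.tsv")
```

The helper raises KeyError when the requested member is not in the archive, which gives an early signal if the layout of the download has changed.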
Create the database (currently the path to Taxon.tsv is hardcoded)
python3 utilities/make_gbif_database.py
The output file is gbif_taxonmatcher
Download the Dutch Species Register (NSR) taxonomy
wget http://api.biodiversitydata.nl/v2/taxon/dwca/getDataSet/nsr
Extract Taxa.txt from the archive
unzip -p nsr Taxa.txt > Taxa.txt
Taxa.txt should be in path/to/galaxy-tool-taxonmatcher/
Create the database (currently the path to Taxa.txt is hardcoded)
python3 utilities/make_nsr_database.py
The output file is nsr_taxonmatcher
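Because the database scripts read their input from a hardcoded path, it can help to check that the input file is in place before starting them. The wrapper below is a sketch under the assumption that the hardcoded path is relative to the galaxy-tool-taxonmatcher folder, as the steps above suggest; the function name run_database_script is hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_database_script(script, required_input, workdir="."):
    """Run a database script from `workdir` after checking its input file exists."""
    workdir = Path(workdir)
    if not (workdir / required_input).is_file():
        raise FileNotFoundError(
            f"{required_input} not found in {workdir.resolve()}"
        )
    # Uses the current Python interpreter rather than looking up python3 on PATH;
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run([sys.executable, str(script)], cwd=workdir, check=True)


# Example calls, run from the galaxy-tool-taxonmatcher folder:
# run_database_script("utilities/make_gbif_database.py", "Taxon.tsv")
# run_database_script("utilities/make_nsr_database.py", "Taxa.txt")
```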
Move the database files (gbif_taxonmatcher and nsr_taxonmatcher) to the desired location
(in our case: /data/blast_databases/taxonomy/). Make sure the path in taxonmatcher.sh
corresponds to this location.
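The move and the path check can be sketched in Python as follows. Both function names are hypothetical, and the check is a plain substring search in taxonmatcher.sh, so it only tells you whether the directory is mentioned somewhere in the script, not how the script uses it.

```python
import shutil
from pathlib import Path


def install_databases(db_files, target_dir):
    """Move the database files into `target_dir` and return their new paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    return [
        Path(shutil.move(str(f), str(target / Path(f).name))) for f in db_files
    ]


def script_mentions_path(script_path, target_dir):
    """Return True if the text of `script_path` contains `target_dir`."""
    text = Path(script_path).read_text(encoding="utf-8")
    # Ignore a trailing slash so both spellings of the directory match.
    return str(target_dir).rstrip("/") in text


# Example calls, using the location mentioned above:
# install_databases(["gbif_taxonmatcher", "nsr_taxonmatcher"],
#                   "/data/blast_databases/taxonomy/")
# script_mentions_path("taxonmatcher.sh", "/data/blast_databases/taxonomy/")
```

If script_mentions_path returns False, edit the path in taxonmatcher.sh by hand so that it points to the directory you chose.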