This tool is experimental. It was created for a specific project at UR, so it is a custom solution to a custom problem. You may still find parts of it useful for your own biodiversity projects. If so, back up your Specify database first and use the synonymization tool at your own risk!
python 3
pip 3
pipenv install
Or
pip3 install pygbif mysql-connector
To remove occurrence data and fetch all synonyms for species queried from GBIF:
- Download some species data from GBIF as a CSV file
- Run the fetch tool. This will likely run for a long time. To stop it, press Ctrl+C:
python3 specify_csv.py -i species_data.csv -o synonyms.csv
- Import the resulting synonyms.csv file into Specify
The mappings are as follows (in the order they're displayed in Specify):
Data Set Columns | Specify Taxon Import Field |
---|---|
taxonID | Species GUID |
genus | Genus |
class | Class |
authorship | Species Author |
order | Order |
kingdom | Kingdom |
family | Family |
phylum | Phylum |
name | Species |
source | Species Source |
If you want to use the synonymization tool, you must map the taxonID field to the Species GUID field in Specify.
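Before importing, it can help to confirm that synonyms.csv contains every column listed in the mapping table and that no row lacks a taxonID, since the synonymization step depends on that field. The following is a minimal sketch, not part of the tool itself. It assumes the file is comma-delimited with a header row that uses the column names from the table; `check_columns` is a hypothetical helper name.

```python
import csv

# Column-to-field mapping copied from the table above.
SPECIFY_FIELD_BY_COLUMN = {
    "taxonID": "Species GUID",
    "genus": "Genus",
    "class": "Class",
    "authorship": "Species Author",
    "order": "Order",
    "kingdom": "Kingdom",
    "family": "Family",
    "phylum": "Phylum",
    "name": "Species",
    "source": "Species Source",
}


def check_columns(csv_path):
    """Return (missing_columns, rows_without_taxon_id) for a synonyms CSV.

    Assumption: comma-delimited file whose first row is a header.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        # Columns from the mapping table that the file does not provide.
        missing = [col for col in SPECIFY_FIELD_BY_COLUMN if col not in header]
        # Row numbers count the header as row 1, so data starts at row 2.
        blank_ids = [
            row_no
            for row_no, row in enumerate(reader, start=2)
            if not (row.get("taxonID") or "").strip()
        ]
    return missing, blank_ids
```

Calling `check_columns("synonyms.csv")` returns two lists. If both are empty, the file has all mapped columns and every row carries a taxonID.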
To synonymize imported GBIF data in Specify:
- Run the synonymization tool with an optional dry run, which will produce two csv reports (specify_accepted_report.csv and specify_synonyms_report.csv) that show which Specify records will be set to preferred names and which will become synonyms, respectively.
The first time you run this tool, you will be prompted to enter the following Specify server credentials:
database name
database username
database password
database hostname
This will produce a specify_config.json file that you can use to connect to Specify without re-entering your credentials.
For this to work, you must use the same synonyms.csv file that you used to import your data into Specify.
python3 specify_synonymize.py -i synonyms.csv -d
The above command will create a config file and produce specify_accepted_report.csv and specify_synonyms_report.csv.
- Run the synonymization tool and update the Specify records outlined in the reports:
python3 specify_synonymize.py -i synonyms.csv -c specify_config.json
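Because the update modifies Specify records, it may be worth reviewing the size of the dry-run reports before running the update command. The sketch below is an illustration only: it counts the data rows in each report and assumes that the reports sit in the given directory and that their first line is a header, neither of which is confirmed here. `summarize_reports` and `count_data_rows` are hypothetical helper names.

```python
import csv
from pathlib import Path

# Report file names as produced by the dry run.
REPORTS = ("specify_accepted_report.csv", "specify_synonyms_report.csv")


def count_data_rows(csv_path):
    """Count rows after the first line (assumed to be a header)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    return max(len(rows) - 1, 0)


def summarize_reports(directory="."):
    """Map each report name to its row count, or None if the file is absent."""
    summary = {}
    for name in REPORTS:
        path = Path(directory) / name
        summary[name] = count_data_rows(path) if path.exists() else None
    return summary
```

A result of `None` for either report suggests the dry run has not been run in that directory yet.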