Bootstrap an OBO Library ontology

https://www.wikidata.org/wiki/Q112336713

The Ontology Development Kit (ODK)

Manage your ontology's life cycle with the Ontology Development Kit (ODK)! The ODK is

  • a toolbox of various ontology related tools such as ROBOT, owltools, dosdp-tools and many more, bundled as a docker image
  • a set of executable workflows for managing your ontology's continuous integration, quality control, releases and dynamic imports

For more details, see

Where to get help

Steering Committee

  • @gouttegd Damien Goutte-Gattat (ODK Lead, FlyBase)
  • @matentzn Nicolas Matentzoglu (ODK Deputy, Semanticly)
  • @cmungall Chris Mungall (ODK Founder, LBNL)

Core team

  • @anitacaron Anita Caron (Novo Nordisk)
  • @balhoff Jim Balhoff (RENCI)
  • @dosumis David Osumi-Sutherland (Sanger)
  • @ehartley Emily Hartley (Critical Path Institute)
  • @hkir-dev Huseyin Kir (EMBL-EBI)
  • @shawntanzk Shawn Tan (Novo Nordisk)
  • @ubyndr Ismail Ugur Bayindir (EMBL-EBI)

Full list of contributors: https://github.com/INCATools/ontology-development-kit/graphs/contributors

Cite

https://doi.org/10.1093/database/baac087

Outstanding contributions

Outstanding contributors are groups and institutions that have helped with organising the ODK development, providing funding, advice and infrastructure. We are very grateful for all your contributions - the project would not exist without you!

Monarch Initiative

The Monarch Initiative is a consortium of medical, biological and computational experts that provide major ontology services such as the Human Phenotype Ontology, Mondo and an integrative data and analytic platform connecting phenotypes to genotypes across species, bridging basic and applied research with semantics-based analysis.

https://monarchinitiative.org/

European Bioinformatics Institute

The Samples, Phenotypes and Ontologies (SPOT) team, led by Helen Parkinson, is concerned with high throughput mammalian phenotyping, Semantics as a Service and human genetics resources. Members of the SPOT team including David Osumi-Sutherland have made major contributions to ODK, and provided advice, use cases and funding.

https://www.ebi.ac.uk/spot/

University of Florida Biomedical Informatics Program

https://hobi.med.ufl.edu/research-2/biomedical-informatics-3/

Knocean Inc.

Knocean Inc. offers consulting and development services for science informatics, in particular in the area of biomedical ontologies and ontology tooling.

http://knocean.com/

Critical Path Institute

The Critical Path For Alzheimer’s Disease (CPAD) is a public-private partnership aimed at creating new tools and methods that can be applied to increase the efficiency of the development process of new treatments for Alzheimer disease (AD) and related neurodegenerative disorders with impaired cognition and function.

https://c-path.org/

Requirements

Docker

As of ODK v1.3.1, using the ODK Docker image requires Docker Engine version 20.10.8 or greater.
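A quick way to verify this requirement is to compare the installed Docker Engine version against the minimum. The following is a minimal sketch, not part of the ODK itself: the helper names version_ge and check_docker are hypothetical, and it assumes the Docker daemon is running so that the server version can be queried.

```shell
#!/bin/sh
# Minimum Docker Engine version stated in the requirement above.
MIN_DOCKER_VERSION="20.10.8"

# version_ge A B: succeeds when dotted version A is >= version B.
# Fields are compared numerically one by one, so 20.10.8 > 20.9.12.
# (Hypothetical helper, not provided by the ODK.)
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# check_docker: succeeds when docker is installed, the daemon answers,
# and the reported server version meets the minimum.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found on PATH" >&2
        return 1
    fi
    installed=$(docker version --format '{{.Server.Version}}' 2>/dev/null) || return 1
    version_ge "$installed" "$MIN_DOCKER_VERSION"
}
```

For example, `check_docker && echo "Docker is recent enough"` prints the message only when the installed engine satisfies the requirement.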

Tips and Tricks

Customizing your ODK installation

You will likely want to customize the build process, and of course to edit the ontology.

We recommend that you do not edit the main Makefile, but instead edit the supplemental one (e.g. myont.Makefile) in src/ontology.

An example of how you can customise your imports is documented here

Migrating an existing ontology repo to the ODK

The ODK is designed for creating a new repo for a new ontology. It can also be used to help figure out how to migrate an existing git repository to the ODK structure. There are different ways to do this.

  • Manually compare your ontology against the template folder and make necessary adjustments
  • Run the seed script as if creating a new repo. Manually compare this with your existing repo, use git mv to rearrange files, and add any missing files by copying them across and doing a git add
  • Create a new repo de novo and abandon your existing one, using, for example, github issue mover to move tickets across.

Obviously the last method is not ideal as you lose your git history. Note that even with git mv, history tracking becomes harder.
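When comparing a freshly seeded repo with an existing one, it can help to list the files that exist only in the seeded copy. The following is a minimal sketch under stated assumptions: list_missing is a hypothetical helper (not an ODK command), and the two directory arguments are whatever paths you used for the seeded and existing checkouts.

```shell
#!/bin/sh
# list_missing SEEDED EXISTING
# Prints, one per line, the paths (relative to the repo root) of files
# that exist in the freshly seeded repo but not in the existing repo.
# Files under .git/ are ignored. (Hypothetical helper for illustration.)
list_missing() {
    seeded=$1
    existing=$2
    (cd "$seeded" && find . -type f ! -path './.git/*') |
        sed 's|^\./||' |
        sort |
        while IFS= read -r f; do
            [ -e "$existing/$f" ] || printf '%s\n' "$f"
        done
}
```

Each path printed by `list_missing path/to/seeded path/to/existing` is a candidate either for copying across followed by git add, or for a git mv if the file already exists elsewhere in your repo under a different name.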

If you have built your ontology using a previous version of ODK, migration of your setup is unfortunately a manual process. In general you do not absolutely need to upgrade your setup, but doing so will bring advantages in terms of aligning with emerging standard ways of doing things. The less customization you do on your repo, the easier it should be to migrate.

Consult the CHANGELOG.md file for changes made between releases to assist in upgrading.

More documentation

You will find additional documentation in the src/ontology/README-editors.md file in your repo.

The ODK also comes with built-in options to generate your own shiny documentation; see for example the PATO documentation here, which is almost entirely autogenerated from the ODK.

Alternative to Docker

You can run the seed script without Docker using Python 3.6 or higher and Java. See requirements.txt for Python requirements.

This is, however, not recommended.

Running OBO dashboard with ODK

Note: this is a highly experimental feature as of ODK version 1.2.24. The display and the scores are under active development and will change considerably in the near future.

Example implementation:

  1. An ODK container wrapper (called odk.sh in the following), similar to the run.sh file in a typical repo's src/ontology directory.
  2. A dashboard config YAML file (called dashboard-config.yml in the following)

With both files, you can then create a dashboard using the following command:

sh odk.sh obodash -C dashboard-config.yml

The wrapper (odk.sh) should contain something like the following:

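#!/bin/sh
# Wrapper script for the ODK Docker container.
# Binds the local dashboard directories and the config file into the
# container, then runs the given command (e.g. obodash) inside it.
# Paths are quoted so that a working directory containing spaces still works.
docker run -e ROBOT_JAVA_ARGS='-Xmx4G' -e JAVA_OPTS='-Xmx4G' \
  -v "$PWD/dashboard:/tools/OBO-Dashboard/dashboard" \
  -v "$PWD/dashboard-config.yml:/tools/OBO-Dashboard/dashboard-config.yml" \
  -v "$PWD/ontologies:/tools/OBO-Dashboard/build/ontologies" \
  -v "$PWD/sparql:/tools/OBO-Dashboard/sparql" \
  -w /work --rm -ti obolibrary/odkfull "$@"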

Note that this essentially binds a few local directories to the running ODK container. The directories serve the following purposes:

  1. dashboard: this is where the dashboard is deposited. Look at index.html in your browser.
  2. ontologies: this is where ontologies are downloaded to and synced up
  3. sparql: an optional directory that allows you to add custom checks on top of the usual OBO profile.
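Because these paths are bind-mounted with -v, it is worth making sure they exist before the first run: when the host path of a -v mount is missing, Docker creates it as a directory, which would leave dashboard-config.yml as an empty directory rather than a file. The following is a minimal sketch of a preparation step; prepare_dashboard_workdir is a hypothetical helper, not part of the ODK.

```shell
#!/bin/sh
# prepare_dashboard_workdir [DIR]
# Creates the three directories that odk.sh binds into the container and
# checks that the dashboard config file is already in place.
# DIR defaults to the current directory. (Hypothetical helper.)
prepare_dashboard_workdir() {
    workdir=${1:-$PWD}
    mkdir -p "$workdir/dashboard" "$workdir/ontologies" "$workdir/sparql" || return 1
    if [ ! -f "$workdir/dashboard-config.yml" ]; then
        echo "missing $workdir/dashboard-config.yml; create it before running odk.sh" >&2
        return 1
    fi
}
```

A typical invocation would be `prepare_dashboard_workdir && sh odk.sh obodash -C dashboard-config.yml`, run from the directory that holds odk.sh.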

This is a minimal example dashboard config for a potential phenotype dashboard:

title: OBO Phenotype Dashboard
description: Quality control for OBO phenotype ontologies. Under construction.
ontologies:
  custom:
    - id: wbphenotype
    - id: dpo
      base_ns:
        - http://purl.obolibrary.org/obo/FBcv
environment:
  ROBOT_JAR: /tools/robot.jar
  ROBOT: robot

The ontologies will, if they exist, be retrieved from their OBO purls and evaluated. There are more options potentially of interest:

title: OBO Phenotype Dashboard
description: Quality control for OBO phenotype ontologies. Under construction.
ontologies:
  custom:
    - id: myont
      mirror_from: https://raw.githubusercontent.com/obophenotype/c-elegans-phenotype-ontology/master/wbphenotype-base.owl
    - id: dpo
      base_ns:
        - http://purl.obolibrary.org/obo/FBcv
prefer_base: True
profile:
  baseprofile: "https://raw.githubusercontent.com/ontodev/robot/master/robot-core/src/main/resources/report_profile.txt"
  custom:
    - "WARN\tfile:./sparql/missing_xrefs.sparql"
report_truncation_limit: 300
redownload_after_hours: 2
environment:
  ROBOT_JAR: /tools/robot.jar
  ROBOT: robot

The options have the following meaning:

  • mirror_from allows specifying a download URL other than the default OBO PURL.
  • base_ns allows specifying the set of namespaces considered to be owned by the ontology; only terms in these namespaces will be evaluated for this ontology. The default is http://purl.obolibrary.org/obo/CAPITALISEDONTOLOGYID.
  • report_truncation_limit allows truncating long (sometimes huge) ontology reports to make them easier on GitHub version control.
  • redownload_after_hours specifies how long to wait before trying to download an ontology again (which could be a time-consuming process!).
  • environment is currently a required parameter but will be made optional in future versions. It allows adding environment variables directly to the config, rather than passing them in as -e parameters to the Docker container (the two are equivalent).
  • profile is an optional parameter that allows specifying your own profile for the quality control (ROBOT) report. By default, the ROBOT report default profile is used. You can either specify your own profile from scratch, or extend the current default with additional tests by using the baseprofile parameter. Find out more about ROBOT profiles here.
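Since environment is currently required, a small pre-flight check on the config file can catch an incomplete configuration before the container is started. The following is a rough sketch only: check_dashboard_config is a hypothetical helper, it merely looks for top-level keys with grep rather than parsing YAML, and treating ontologies as mandatory is an assumption based on the examples above.

```shell
#!/bin/sh
# check_dashboard_config FILE
# Succeeds when FILE exists and contains the top-level keys
# "ontologies" (assumed mandatory) and "environment" (currently required).
# This is a plain-text check, not a YAML parser. (Hypothetical helper.)
check_dashboard_config() {
    config=$1
    if [ ! -f "$config" ]; then
        echo "no such file: $config" >&2
        return 1
    fi
    status=0
    for key in ontologies environment; do
        if ! grep -q "^${key}:" "$config"; then
            echo "missing top-level key: $key" >&2
            status=1
        fi
    done
    return $status
}
```

For example, `check_dashboard_config dashboard-config.yml || exit 1` at the top of a build script stops early when either key is absent.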

A fully working example can be found here.