/pronto

A Python frontend to (Open Biomedical) Ontologies.

Primary LanguagePythonMIT LicenseMIT

pronto Stars

A Python frontend to ontologies.

Actions License Source Docs Coverage Sanity PyPI Bioconda Versions Wheel Changelog GitHub issues DOI Downloads

🚩 Table of Contents

🗺️ Overview

Pronto is a Python library to parse, browse, create, and export ontologies, supporting several ontology languages and formats. It implement the specifications of the Open Biomedical Ontologies 1.4 in the form of an safe high-level interface. If you're only interested in parsing OBO or OBO Graphs document, you may wish to consider fastobo instead.

🏳️ Supported Languages

🔧 Installing

Installing with pip is the easiest:

# pip install pronto          # if you have the admin rights
$ pip install pronto --user   # install it in a user-site directory

There is also a conda recipe in the bioconda channel:

$ conda install -c bioconda pronto

Finally, a development version can be installed from GitHub using setuptools, provided you have the right dependencies installed already:

$ git clone https://github.com/althonos/pronto
$ cd pronto
# python setup.py install

💡 Examples

If you're only reading ontologies, you'll only use the Ontology class, which is the main entry point.

>>> from pronto import Ontology

It can be instantiated from a path to an ontology in one of the supported formats, even if the file is compressed:

>>> go = Ontology("tests/data/go.obo.gz")

Loading a file from a persistent URL is also supported, although you may also want to use the Ontology.from_obo_library method if you're using persistent URLs a lot:

>>> cl = Ontology("http://purl.obolibrary.org/obo/cl.obo")
>>> stato = Ontology.from_obo_library("stato.owl")

🏷️ Get a term by accession

Ontology objects can be used as mappings to access any entity they contain from their identifier in compact form:

>>> cl['CL:0002116']
Term('CL:0002116', name='B220-low CD38-positive unswitched memory B cell')

Note that when loading an OWL ontology, URIs will be compacted to CURIEs whenever possible:

>>> aeo = Ontology.from_obo_library("aeo.owl")
>>> aeo["AEO:0000078"]
Term('AEO:0000078', name='lumen of tube')

🖊️ Create a new term from scratch

We can load an ontology, and edit it locally. Here, we add a new protein class to the Protein Ontology.

>>> pr = Ontology.from_obo_library("pr.obo")
>>> brh = ms.create_term("PR:XXXXXXXX")
>>> brh.name = "Bacteriorhodopsin"
>>> brh.superclasses().add(pr["PR:000001094"])  # is a rhodopsin-like G-protein
>>> brh.disjoint_from.add(pr["PR:000036194"])   # disjoint from eukaryotic proteins

✏️ Convert an OWL ontology to OBO format

The Ontology.dump method can be used to serialize an ontology to any of the supported formats (currently OBO and OBO JSON):

>>> edam = Ontology("http://edamontology.org/EDAM.owl")
>>> with open("edam.obo", "wb") as f:
...     edam.dump(f, format="obo")

🌿 Find ontology terms without subclasses

The terms method of Ontology instances can be used to iterate over all the terms in the ontology (including the ones that are imported). We can then use the is_leaf method of Term objects to check is the term is a leaf in the class inclusion graph.

>>> ms = Ontology("ms.obo")
>>> for term in ms.terms():
...     if term.is_leaf():
...         print(term.id)
MS:0000000
MS:1000001
...

🤫 Silence warnings

pronto is explicit about the parts of the code that are doing non-standard assumptions, or missing capabilities to handle certain constructs. It does so by raising warnings with the warnings module, which can get quite verbose.

If you are fine with the inconsistencies, you can manually disable warning reports in your consumer code with the filterwarnings function:

import warnings
import pronto
warnings.filterwarnings("ignore", category=pronto.warnings.ProntoWarning)

📖 API Reference

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pronto.Ontology

📜 License

This library is provided under the open-source MIT license. Please cite this library if you are using it in a scientific context using the following DOI: 10.5281/zenodo.595572