althonos/pronto

Add support for catalogs in imports

cmungall opened this issue · 1 comment

It is common practice to include a catalog-v001.xml file next to an ontology that determines how its imports should be resolved, e.g.:

https://github.com/INCATools/ontology-access-kit/blob/20e82c06b53e414454f50ca9d9f82468b5867f9d/tests/input/catalog-v001.xml

Example input:

https://github.com/INCATools/ontology-access-kit/blob/20e82c06b53e414454f50ca9d9f82468b5867f9d/tests/input/test_import_root.obo

When running locally, the catalog would resolve the import to:

https://github.com/INCATools/ontology-access-kit/blob/20e82c06b53e414454f50ca9d9f82468b5867f9d/tests/input/test_imported_ontology.owl

Ontology("test_import_root.obo")

gives a stack trace:

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/ontology.py:283: in __init__
    cls(self).parse_from(_handle)  # type: ignore
/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/parsers/obo.py:26: in parse_from
    self.process_imports(
/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/parsers/base.py:70: in process_imports
    return dict(pool.map(lambda i: (i, process(i)), imports))
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py:364: in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py:771: in get
    raise self._value
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py:125: in worker
    result = (True, func(*args, **kwds))
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py:48: in mapstar
    return list(map(*args))
/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/parsers/base.py:70: in <lambda>
    return dict(pool.map(lambda i: (i, process(i)), imports))
/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/parsers/base.py:49: in process_import
    return Ontology(url, max(import_depth - 1, -1), timeout)
/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/ontology.py:283: in __init__
    cls(self).parse_from(_handle)  # type: ignore
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pronto.parsers.rdfxml.RdfXMLParser object at 0x7f8f390bb490>
handle = <BufferedReader>, threads = None

    def parse_from(self, handle, threads=None):
        # Load the XML document into an XML Element tree
        tree: etree.ElementTree = etree.parse(handle)
    
        # Load metadata from the `owl:Ontology` element
        owl_ontology = tree.find(_NS["owl"]["Ontology"])
        if owl_ontology is None:
>           raise ValueError("could not find `owl:Ontology` element")
E           ValueError: could not find `owl:Ontology` element

/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/pronto/parsers/rdfxml.py:89: ValueError

The imported ontology loads fine when passed to pronto directly.

Ideally:

  1. It should be possible to specify a dict for rewiring imports.
  2. There would be a reader for the catalog format that builds this dict.
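
As a rough illustration of point 2, here is a minimal sketch of a catalog reader using only the standard library. It assumes the catalog uses the OASIS XML catalog namespace with `<uri name="..." uri="..."/>` entries and that relative `uri` values are resolved against the catalog's directory. The function name `read_catalog`, the `base_dir` parameter, and the example.org IRI are hypothetical and not part of pronto.

```python
import io
import os
import xml.etree.ElementTree as ET

# Namespace used by OASIS XML catalogs (assumed to apply to catalog-v001.xml).
CATALOG_NS = "{urn:oasis:names:tc:entity:xmlns:xml:catalog}"


def read_catalog(source, base_dir="."):
    """Return {import IRI: local location} for every <uri> entry in a catalog.

    `source` is a file name or a binary file object. This helper is
    hypothetical; it is not an existing pronto function.
    """
    root = ET.parse(source).getroot()
    mapping = {}
    # iter() also finds <uri> entries nested inside <group> elements.
    for entry in root.iter(CATALOG_NS + "uri"):
        name, uri = entry.get("name"), entry.get("uri")
        if not name or not uri:
            continue
        # Leave absolute URLs alone; resolve relative paths against base_dir.
        if "://" not in uri:
            uri = os.path.join(base_dir, uri)
        mapping[name] = uri
    return mapping


# Made-up catalog content for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <uri name="http://example.org/imported.owl" uri="test_imported_ontology.owl"/>
</catalog>
"""

rewiring = read_catalog(io.BytesIO(SAMPLE), base_dir="tests/input")
```

The resulting `rewiring` dict is the kind of object point 1 asks for; how it would be passed to `Ontology` is left open here.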

It should also be possible to map an ontology import to an empty (/dev/null) ontology. It's not atypical for an edit version of an ontology to import a local ontology in functional syntax (e.g. go-edit.obo does this for GCIs).
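
One possible way to express both behaviours with a single dict is sketched below: a string value redirects the import, and `None` stands for the empty ontology. `resolve_import` and the example.org IRIs are hypothetical, and this does not reflect how pronto resolves imports today.

```python
from typing import Dict, List, Optional


def resolve_import(ref: str, rewiring: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the location to load for `ref`, or None to skip the import.

    Imports that are absent from `rewiring` are returned unchanged.
    Hypothetical helper, not part of pronto.
    """
    return rewiring.get(ref, ref)


# Assumed rewiring dict: one import redirected to a local file, one silenced.
rewiring = {
    "http://example.org/imported.owl": "tests/input/test_imported_ontology.owl",
    "http://example.org/gci-axioms.ofn": None,  # treat as an empty ontology
}

imports: List[str] = [
    "http://example.org/imported.owl",
    "http://example.org/gci-axioms.ofn",
    "http://example.org/other.obo",
]

# Keep only the imports that should actually be loaded.
to_load = [
    location
    for location in (resolve_import(ref, rewiring) for ref in imports)
    if location is not None
]
```

Using `None` as the marker keeps the dict flat, at the cost of not distinguishing "skip" from a missing value in catalog files; a dedicated sentinel object would be an alternative.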

I concur this is a needed feature. Would also solve #62.