INCATools/dead_simple_owl_design_patterns

Add a script to generate OWL

cmungall opened this issue · 1 comments

This subsumes my #7 request.

Script will take a pattern file plus a TSV or CSV file. Column headings will be the names of the vars in the pattern, optionally with additional column headings for annotation properties.

Script will generate OWL for all classes for all values in TSV by applying the pattern. If values are specified in the annotation columns, these will overwrite the defaults.
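A minimal sketch of the templating step described above, in pure Python. The pattern structure here is a simplified, hypothetical stand-in (printf-style `text` plus a `vars` list per field), not the actual pattern file format, and the output is plain strings rather than OWL. The names `PATTERN`, `fill` and `apply_pattern` are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical, simplified pattern: each field is a printf-style template
# plus the ordered list of variable names used to fill it.
PATTERN = {
    "name": {"text": "%s in %s", "vars": ["process", "location"]},
    "def": {"text": "A %s that occurs in a %s.", "vars": ["process", "location"]},
}


def fill(field, row):
    """Substitute the row's values for the field's vars, in order."""
    return field["text"] % tuple(row[v] for v in field["vars"])


def apply_pattern(pattern, tsv_text):
    """Return one record per TSV row.

    A non-empty column named after a pattern field (e.g. "name")
    overwrites the default generated from the template.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        record = {}
        for key, field in pattern.items():
            override = row.get(key)  # None if the column is absent
            record[key] = override if override else fill(field, row)
        records.append(record)
    return records


# Example input: two var columns plus an optional "name" override column.
TSV = (
    "process\tlocation\tname\n"
    "transport\tmembrane\t\n"
    "secretion\tgland\tglandular secretion\n"
)

if __name__ == "__main__":
    for rec in apply_pattern(PATTERN, TSV):
        print(rec)
```

In this sketch the first row gets the generated default name, while the second row keeps the name supplied in the annotation column. No validation of the var values is attempted, in line with the "minimal" option discussed below.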

TBD: should the script be minimal, or be a general purpose toolkit?

For example, if minimal, the script would not validate the var values (this could be done by a separate procedure, which is capable of validating the whole ontology, not just newly added classes), nor would it check whether the generated classes are satisfiable or equivalent to existing classes. In that case, the script could potentially be pure Python with no owlapi hookup. It's literally just a syntactic macro/template-style system.

I can imagine supporting two modes.

  1. Where the URIs are specified in advance in the file (an additional ID/URI column)
  2. Where the URIs are not known, and the intent is to use whatever ID generation procedure is used by the ontology to make the URIs as part of the class generation.

Tentatively I imagine supporting both, with the script being minimal and making no assumptions about ID generation. If no IDs are specified in the file, then a UUID is generated. It is up to the user to apply whatever post-processing is required to replace these with URIs.
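The two modes could be sketched roughly as follows, assuming the rows have already been read into dicts. The column name `iri` and the function `assign_ids` are hypothetical; the only library call used is the standard `uuid` module, whose `urn` attribute yields a `urn:uuid:...` string that can be swapped for a real URI in post-processing.

```python
import uuid


def assign_ids(rows, id_column="iri"):
    """Return copies of rows, each guaranteed to carry an identifier.

    Rows that already specify a value in id_column keep it (mode 1);
    rows without one get a freshly generated UUID URN (mode 2),
    to be replaced by the ontology's own ID scheme later.
    """
    out = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not mutated
        if not row.get(id_column):
            row[id_column] = uuid.uuid4().urn
        out.append(row)
    return out


if __name__ == "__main__":
    rows = [
        {"iri": "http://example.org/EX_0000001", "process": "transport"},
        {"iri": "", "process": "secretion"},
    ]
    for r in assign_ids(rows):
        print(r["iri"], r["process"])
```

Keeping ID assignment as a separate step like this means the templating code itself makes no assumptions about how the ontology mints URIs.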

Implementation example here:
https://github.com/dosumis/dead_simple_owl_design_patterns/tree/master/src/examples
It uses some ID gen code I've worked on elsewhere.

Trivial to adapt to take a TSV table as input but, thinking about this, I should probably do that in pattern.

The major missing elements, because not yet supported by pattern.py, are:

  • dbxrefs
  • synonyms (could be generated from two of the inputs in this example)