Reversible operations
cmungall opened this issue · 3 comments
Although the primary purpose is to generate, it's nice to have the inverse operation, parse.
As both lexical and logical axioms are simultaneously generated from a single pattern instance, there are a number of possibilities that could be performed, either with this code or in downstream client code:
- parse a label or synonym to infer pattern and slot fillers
- 'parse'/match a logical axiom (particularly equivalence) to infer pattern and slot fillers
- parse logical and non-logical axioms simultaneously to check for consistency
- gap fill: parse axioms of type T and generate missing axioms for type U
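The consistency-check and gap-fill ideas in the list above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the slot fillers have already been recovered from the logical axiom, that label templates are printf-style strings, and that a simple ID-to-label lookup is available. The function names and the EX: identifiers are hypothetical, not part of dosdp.

```python
def expected_label(text, variables, fillers, labels):
    """Fill a printf-style template with the labels of the slot fillers.

    text      -- template such as "%s of %s" (assumed format)
    variables -- slot names in template order
    fillers   -- slot name -> class ID, as recovered from the logical axiom
    labels    -- class ID -> label lookup
    """
    return text % tuple(labels[fillers[v]] for v in variables)


def check_term(existing_label, text, variables, fillers, labels):
    """Compare the label in the ontology with the one the pattern would generate.

    Returns a (status, generated_label) pair where status is one of
    "ok", "missing" (gap fill candidate) or "mismatch" (flag for review).
    """
    want = expected_label(text, variables, fillers, labels)
    if existing_label is None:
        return ("missing", want)
    if existing_label != want:
        return ("mismatch", want)
    return ("ok", want)


# Hypothetical IDs and labels, for illustration only.
labels = {"EX:0000001": "length", "EX:0000002": "tail"}
fillers = {"attribute": "EX:0000001", "entity": "EX:0000002"}
result = check_term(None, "%s of %s", ["attribute", "entity"], fillers, labels)
```

What to do with a "mismatch" (flag it, overwrite it, or leave it alone) is left to the client, since hand-edited labels and definitions are expected.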
Parsing is often harder than generating. However, a few things work in our favor:
- it looks like dosdp doesn't allow optional slots (the downside is some duplication across templates, but I think that's OK), so this will make parsing easier
- the owlapi supposedly renders axioms in a deterministic fashion (I am not yet sure I trust it to do this)
So in theory all a parser needs to do is generate a dummy literal or expression (e.g. for label or equivalence axioms), serialize it, and then string-align it with those in the ontology. Alternatively, the slot fillers could be replaced with regular-expression capture groups, e.g. (\S+), to generate regexes that could be used to test constructs. If the two points above hold, then this should work.
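A minimal sketch of the regex idea, assuming a printf-style label template with a parallel list of slot names (for example "%s of %s" with slots attribute and entity). The helper names are hypothetical. The sketch uses a lazy `.+?` group rather than `(\S+)`, which is an assumption on my part, because filler labels often contain spaces.

```python
import re


def template_to_regex(text, variables):
    """Compile a printf-style template into a regex with one named group per slot."""
    parts = text.split("%s")
    if len(parts) - 1 != len(variables):
        raise ValueError("number of %s slots must match number of variables")
    # Literal text is escaped; each slot becomes a lazy named capture group.
    pattern = re.escape(parts[0])
    for var, literal in zip(variables, parts[1:]):
        pattern += f"(?P<{var}>.+?)" + re.escape(literal)
    return re.compile(f"^{pattern}$")


def parse_label(label, text, variables):
    """Return a slot -> matched-text dict, or None if the label does not fit the template."""
    match = template_to_regex(text, variables).match(label)
    return match.groupdict() if match else None


parsed = parse_label("length of tail", "%s of %s", ["attribute", "entity"])
```

The matched strings would still need to be resolved back to classes via their labels. Labels that contain the template's literal text are ambiguous: "rate of production of insulin" is split at the first " of " by the lazy groups, which may not be the intended reading.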
Of course we expect a lot of customization of things like text definitions, but the client app could implement the necessary logic. E.g. if an axiom was generated only by patterns, has no human signature, and differs from what the pattern would produce, then flag or change it.
Parsing at the string level is obviously bad
@balhoff has some ideas about how to generate a SPARQL query from a DOSDP
This is what one might look like for the OBA entity_attribute pattern:
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix part_of: <http://purl.obolibrary.org/obo/BFO_0000050>
prefix inheres_in: <http://purl.obolibrary.org/obo/RO_0000052>
select distinct ?term ?attribute ?entity ?location where {
?term owl:equivalentClass ?desc .
?desc owl:intersectionOf/rdf:rest*/rdf:first ?attribute .
?desc owl:intersectionOf/rdf:rest*/rdf:first ?diff .
?desc owl:intersectionOf/rdf:rest/rdf:rest rdf:nil .
?diff owl:onProperty inheres_in: .
?diff owl:someValuesFrom ?filler .
?filler owl:intersectionOf/rdf:rest*/rdf:first ?entity .
?filler owl:intersectionOf/rdf:rest*/rdf:first ?diff2 .
?filler owl:intersectionOf/rdf:rest/rdf:rest rdf:nil .
?diff2 owl:onProperty part_of: .
?diff2 owl:someValuesFrom ?location .
FILTER(?attribute != ?diff)
FILTER(?entity != ?diff2)
}
This is obviously tedious to write by hand, but if we had a tool to generate the SPARQL, we could then check the SPARQL into the ontology repo and plug it into the pipeline.
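As a rough sketch of what such a generator could look like: the code below walks a small expression tree and emits the same kinds of triple patterns as the hand-written query above. The nested-tuple representation of the pattern and all function names are hypothetical, chosen only for this illustration; a real tool would read the DOSDP file and handle more OWL constructs.

```python
from itertools import combinations, count

# Expression forms (hypothetical representation):
#   "name"                  -> a slot, bound to ?name
#   ("some", prop, filler)  -> existential restriction
#   ("and", [operand, ...]) -> intersection


def node_var(expr, fresh):
    """Slot name for a bare slot, otherwise a fresh helper variable."""
    return f"?{expr}" if isinstance(expr, str) else f"?n{next(fresh)}"


def emit(expr, node, fresh, out):
    """Append triple patterns constraining `node` to match `expr`."""
    if isinstance(expr, str):
        return  # a slot needs no further constraints
    if expr[0] == "some":
        _, prop, filler = expr
        target = node_var(filler, fresh)
        out.append(f"{node} owl:onProperty {prop} .")
        out.append(f"{node} owl:someValuesFrom {target} .")
        emit(filler, target, fresh, out)
    elif expr[0] == "and":
        operands = expr[1]
        members = [node_var(op, fresh) for op in operands]
        for m in members:
            out.append(f"{node} owl:intersectionOf/rdf:rest*/rdf:first {m} .")
        # Pin the list length so extra conjuncts do not match.
        rest = "/".join(["owl:intersectionOf"] + ["rdf:rest"] * len(operands))
        out.append(f"{node} {rest} rdf:nil .")
        # Each operand must bind to a different list member.
        for a, b in combinations(members, 2):
            out.append(f"FILTER({a} != {b})")
        for op, m in zip(operands, members):
            emit(op, m, fresh, out)
    else:
        raise ValueError(f"unsupported expression: {expr!r}")


def slots(expr):
    """Slot names in order of appearance."""
    if isinstance(expr, str):
        return [expr]
    if expr[0] == "some":
        return slots(expr[2])
    return [s for op in expr[1] for s in slots(op)]


def pattern_to_sparql(expr, prefixes):
    out = ["?term owl:equivalentClass ?desc ."]
    emit(expr, "?desc", count(1), out)
    header = [f"prefix {name}: <{iri}>" for name, iri in prefixes.items()]
    select = "select distinct ?term " + " ".join(f"?{s}" for s in slots(expr)) + " where {"
    return "\n".join(header + [select] + out + ["}"])


entity_attribute = ("and", ["attribute", ("some", "inheres_in:",
                    ("and", ["entity", ("some", "part_of:", "location")]))])
query = pattern_to_sparql(entity_attribute, {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "part_of": "http://purl.obolibrary.org/obo/BFO_0000050",
    "inheres_in": "http://purl.obolibrary.org/obo/RO_0000052",
})
```

The generated query uses ?n1, ?n2, ?n3 where the hand-written one uses ?diff, ?filler, ?diff2, and it places each FILTER next to its list rather than at the end; the constraints are otherwise the same.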
I have code here for generating SPARQL queries for DOSDPs: https://github.com/balhoff/dosdp-scala It's a little bit preliminary so far. I added a query runner that will search a provided ontology. Using Jena the queries are pretty slow, but this may be fixable by using in-memory Blazegraph instead.
https://github.com/balhoff/dosdp-scala Does this.