ucoProject/UCO

UCO usage of upstream design vocabularies needs a typo checker

ajnelson-nist opened this issue · 1 comments

Background

UCO uses four concept namespaces to define its ontology:

  • RDF - http://www.w3.org/1999/02/22-rdf-syntax-ns#
  • RDFS - http://www.w3.org/2000/01/rdf-schema#
  • OWL - http://www.w3.org/2002/07/owl#
  • SHACL - http://www.w3.org/ns/shacl#

Each of these namespaces has a fixed set of members, such as owl:Class, with defined semantics. The members' IRIs must be treated as case-sensitive: RDF tooling and most programming languages compare IRIs as exact strings, so owl:Class and owl:class are different IRIs.

Unfortunately, some of these concepts are easy to misspell, or to wrongly assume exist (or do not exist) in a namespace, and this can have consequences for some RDF systems.

  • Testing for Issue 449 encountered a curious issue where Protégé failed to load, logging Cause: value cannot be null at this stage. The ultimate cause of this failure to load was a typo'd property, owl:AnnotatedProperty (and similar), which should have used a lowercase a at the start of the concept-fragment (owl:annotatedProperty).
  • A UCO concept was typo'd as an rdfs:DataType, which excludes it from being treated as an rdfs:Datatype.
  • A concept thought to be in OWL, and triggering reports of errors when reviewing other ontologies, has turned out to not exist. owl:ontologyIRI is not an IRI within the OWL namespace. The informal concept of "ontologyIRI" is referenced significantly in the OWL syntax document, but is not defined in the OWL namespace file. "Ontology IRI" instead refers to x in triples of the form x a owl:Ontology, and discussions about it being defined or not pertain to whether the subject is a blank node or not.
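
To make the last point concrete, here is a minimal, stdlib-only sketch that models triples as plain tuples and recovers the "ontology IRI" as the subject of an x a owl:Ontology triple. The find_ontology_iris helper and the example.org IRIs are hypothetical and exist only for illustration; a real check would work over a parsed graph.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"


def find_ontology_iris(triples):
    """Return the subjects typed as owl:Ontology (hypothetical helper)."""
    return sorted(
        s for (s, p, o) in triples if p == RDF_TYPE and o == OWL_ONTOLOGY
    )


# Hypothetical example data; the example.org IRIs are placeholders.
triples = [
    ("http://example.org/onto", RDF_TYPE, OWL_ONTOLOGY),
    ("http://example.org/onto#Thing", RDF_TYPE, OWL_CLASS),
]
ontology_iris = find_ontology_iris(triples)
```

No owl:ontologyIRI predicate is involved anywhere: the ontology IRI is simply the subject of the typing triple. If that subject were a blank node instead, the ontology would have no ontology IRI.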

A mechanism exists in rdflib that can report when a non-member is referenced in a "Closed namespace". UCO should adopt that mechanism within the ontology repository, as part of testing its usage of ontology-implementation vocabularies.
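
The idea behind a closed namespace can be sketched without any dependencies. The ClosedVocabulary class below is a hypothetical stand-in written for this illustration, not rdflib's implementation, and the OWL term list is deliberately abbreviated to a handful of members.

```python
class ClosedVocabulary:
    """Illustrative stand-in for a closed namespace: only listed terms resolve."""

    def __init__(self, base, terms):
        self.base = base
        self._terms = frozenset(terms)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._terms:
            raise AttributeError(
                f"term {name!r} not in namespace {self.base!r}"
            )
        return self.base + name

    def __contains__(self, iri):
        # Membership test on a full IRI string; the comparison is case-sensitive.
        return iri.startswith(self.base) and iri[len(self.base):] in self._terms


# Abbreviated term list (assumption); the real OWL namespace has many more members.
OWL = ClosedVocabulary(
    "http://www.w3.org/2002/07/owl#",
    ["Class", "Ontology", "annotatedProperty", "annotatedSource", "annotatedTarget"],
)

correct_iri = OWL.annotatedProperty  # resolves to the full IRI

try:
    OWL.AnnotatedProperty  # wrong case: not a member
    typo_detected = False
except AttributeError:
    typo_detected = True
```

Because membership is an exact string match, the uppercase variant is rejected at the point of use instead of silently producing an IRI that no tool recognizes.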

Requirements

Requirement 1

  • UCO must review its usage of standard ontology design and implementation vocabularies (including at least RDF, RDFS, OWL, and SHACL) to confirm that the concepts UCO uses are members of the vocabulary namespaces. This review includes the case-sensitivity of the IRIs.

Non-Requirements

The following requirement is deferred to a future Issue, in order to postpone downstream-implementation logistics (particularly around pytest) and treat the current Issue as a fast-track proposal.

  • UCO's review mechanism must be consumable by ontologies that adopt UCO, such as CASE.

Usage of ClosedNamespace for UCO concepts is out of scope for this Issue, as a matter of Python library support.

Usage of SHACL-SHACL is out of scope for this Issue, so that this Issue may focus on concept typo review rather than standardized SHACL-based review of SHACL shapes.

Risk / Benefit analysis

This Issue is proposed for fast-track consideration because the absence of this check has already caused errors. (See Background.)

Benefits

  • Classes of bugs that arise from incorrect concept references are prevented.
  • Certain cryptic error messages from ontology review tools are prevented.
  • Incorrect concept usage would be flagged by UCO Continuous Integration.

Risks

No appreciable risks are known.

Competencies demonstrated

Competency 1

UCO's implementation status as of 1.0.0 has certain incorrect concept references in it.

Competency Question 1.1

What are the incorrect concept references in UCO 1.0.0?

Result 1.1

The unit test's implementation notes these erroneous concept references:

  • http://www.w3.org/2002/07/owl#AnnotatedProperty
  • http://www.w3.org/2002/07/owl#AnnotatedSource
  • http://www.w3.org/2002/07/owl#AnnotatedTarget
  • http://www.w3.org/2002/07/owl#ontologyIRI
  • http://www.w3.org/2000/01/rdf-schema#DataType

Solution suggestion

  • Add a pytest test, using the RDFLib ClosedNamespace instances to confirm concept membership or flag non-membership. Non-membership must cause CI to report a failed test.
  • File one or more bugfix Pull Requests to address the reported issues. Also merge them into the branch bearing the new pytest.
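
The shape of such a test might look like the sketch below. It is a stdlib-only approximation under stated assumptions: KNOWN_TERMS holds abbreviated, hand-written member lists rather than the RDFLib namespace objects the proposal calls for, find_non_members is a hypothetical helper, and the set of used IRIs is hard-coded where a real test would collect every IRI from the ontology graph.

```python
# Abbreviated member lists (assumption); a real test needs the full vocabularies.
KNOWN_TERMS = {
    "http://www.w3.org/2002/07/owl#": {
        "Class", "Ontology",
        "annotatedProperty", "annotatedSource", "annotatedTarget",
    },
    "http://www.w3.org/2000/01/rdf-schema#": {"Datatype", "comment", "label"},
}


def find_non_members(iris):
    """Return, sorted, each IRI that falls in a known namespace but is not a member."""
    offenders = set()
    for iri in iris:
        for base, terms in KNOWN_TERMS.items():
            if iri.startswith(base) and iri[len(base):] not in terms:
                offenders.add(iri)
    return sorted(offenders)


def test_no_unknown_vocabulary_terms():
    # Hypothetical input; in practice these would be gathered from the ontology.
    used_iris = {
        "http://www.w3.org/2002/07/owl#Class",
        "http://www.w3.org/2002/07/owl#annotatedProperty",
        "http://www.w3.org/2000/01/rdf-schema#Datatype",
    }
    # Any non-member makes the assertion fail, which fails the CI run.
    assert find_non_members(used_iris) == []
```

Returning the full sorted list, rather than stopping at the first offender, lets a single CI run report every typo at once.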

Coordination

  • Tracking in Jira ticket OC-271
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2022-10-11.
  • Solution announced to OCs on 2022-10-11.
  • Solutions Approval to be discussed in OC meeting, 2022-10-20
  • Solutions Approval vote occurred, passing, on 2022-10-20.
  • Solutions development phase completed.
  • Implementation merged into develop
  • Milestone linked
  • Documentation logged in pending release page

I should clarify something about the Oct. 20th vote - I suggest we have a Solutions Approval vote. The fixes to the listed bugs are trivial, except for excising owl:ontologyIRI from some tests. I request a vote so we don't wait for voting in case I get distracted from implementing that test. Feedback is welcome on this plan of action.