zazuko/rdf-validate-shacl

Shape inheritance only works when single dataset is used

tpluscode opened this issue · 5 comments

The validator supports shape hierarchies using rdfs:subClassOf, but they are not applied correctly if the data graph and shapes graph are separate instances of Dataset.

Consider this example.

The result is a false positive, even though the instance should be validated as a foaf:Agent.

The correct result is returned if a single dataset is populated, for example using actual named graphs:

const dataset = $rdf.dataset()
const dataGraph = clownface({ dataset, graph: $rdf.namedNode('data-graph') })
const shapesGraph = clownface({ dataset, graph: $rdf.namedNode('shapes-graph') })

For inheritance to work, your "ontology" (rdfs:subClassOf) must be included in the "data" dataset. I think it makes sense, doesn't it?
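To make that concrete, here is a minimal sketch of a class check that resolves rdfs:subClassOf only from the quads it is given. It is an illustration, not the validator's actual code: plain objects stand in for RDF/JS quads, `isInstanceOf` is a hypothetical helper, and the foaf:Person subclass and the example.org node are assumed for the example.

```javascript
// Illustration only: plain objects stand in for RDF/JS quads, and
// isInstanceOf is a hypothetical helper, not part of rdf-validate-shacl.
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'
const SUBCLASS_OF = 'http://www.w3.org/2000/01/rdf-schema#subClassOf'
const FOAF = 'http://xmlns.com/foaf/0.1/'

// True if `node` has rdf:type `cls` or a transitive subclass of `cls`,
// looking only at the quads passed in.
function isInstanceOf(quads, node, cls) {
  const accepted = new Set([cls])
  const queue = [cls]
  while (queue.length > 0) {
    const current = queue.shift()
    for (const q of quads) {
      const isNewSubclass = q.predicate === SUBCLASS_OF &&
        q.object === current && !accepted.has(q.subject)
      if (isNewSubclass) {
        accepted.add(q.subject)
        queue.push(q.subject)
      }
    }
  }
  return quads.some(q =>
    q.subject === node && q.predicate === RDF_TYPE && accepted.has(q.object))
}

// Assumed example: the instance lives in the data, the hierarchy in the shapes.
const alice = 'http://example.org/alice'
const data = [{ subject: alice, predicate: RDF_TYPE, object: FOAF + 'Person' }]
const shapes = [{ subject: FOAF + 'Person', predicate: SUBCLASS_OF, object: FOAF + 'Agent' }]

const separate = isInstanceOf(data, alice, FOAF + 'Agent')
const union = isInstanceOf([...data, ...shapes], alice, FOAF + 'Agent')
console.log(separate, union) // false true
```

With the hierarchy missing from the data, the node is never recognised as a foaf:Agent, so a shape targeting that class has nothing to check.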

No, I don't think it does. The class hierarchy is not a property of the data itself but of its metamodel (classes).

Holger suggests running the validation over the union graph, which already works, as I mentioned above. The only problem is when the two RDF/JS datasets are separate.

I would consider merging them in memory (maybe opt-in, to prevent copying large amounts of data). That would be transparent to the caller, whether they use one or two datasets as input.
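Roughly what I have in mind, as a sketch only. `prepareInput` and the `mergeGraphs` flag are hypothetical names, not existing options of the validator, and plain arrays stand in for RDF/JS datasets.

```javascript
// Hypothetical sketch: prepareInput and mergeGraphs are illustrative names,
// not options of rdf-validate-shacl. Arrays stand in for RDF/JS datasets.
function prepareInput(dataQuads, shapesQuads, { mergeGraphs = false } = {}) {
  if (!mergeGraphs) {
    // Default: nothing is copied, the data graph is used exactly as given.
    return { data: dataQuads, shapes: shapesQuads, copied: 0 }
  }
  // Opt-in: copy both inputs into one in-memory collection so that
  // class-hierarchy lookups see the union of the two.
  const union = [...dataQuads, ...shapesQuads]
  return { data: union, shapes: shapesQuads, copied: union.length }
}

const dataQuads = [{ subject: 'ex:alice', predicate: 'rdf:type', object: 'foaf:Person' }]
const shapesQuads = [
  { subject: 'foaf:Person', predicate: 'rdfs:subClassOf', object: 'foaf:Agent' },
  { subject: 'ex:AgentShape', predicate: 'sh:targetClass', object: 'foaf:Agent' }
]

const untouched = prepareInput(dataQuads, shapesQuads)
const merged = prepareInput(dataQuads, shapesQuads, { mergeGraphs: true })
console.log(untouched.copied, merged.copied)
```

The caller passes the same two inputs either way. Note that this naive concatenation does nothing about blank node labels that occur in both inputs.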

I like the idea of merging the datasets inside the validator, but I'm afraid of the consequences, mostly regarding blank nodes.

We're back to the same considerations: if we can agree that users should not expect blank nodes to keep their ID in the final "validation report" dataset, changing these IDs during validation shouldn't be an issue.

I think blank nodes will not be a problem if the "merging" happened in named graphs inside the joint dataset.

The only requirement would be to address them as (bnode, graph). If clownface is used, that would already be taken care of by using pointers to a specific graph.
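A small sketch of the (bnode, graph) idea, under the assumption that both inputs are copied into named graphs of one joint collection. Plain objects stand in for RDF/JS terms, and `intoGraph` and `termKey` are hypothetical helpers, not clownface or validator APIs.

```javascript
// Illustration only: plain objects stand in for RDF/JS terms and quads.
const bnode = value => ({ termType: 'BlankNode', value })
const iri = value => ({ termType: 'NamedNode', value })
const literal = value => ({ termType: 'Literal', value })

// Hypothetical helper: copy quads into a named graph of a joint collection.
const intoGraph = (quads, graph) => quads.map(q => ({ ...q, graph }))

// Hypothetical helper: blank nodes are identified by label AND graph,
// other terms by their value alone.
function termKey(term, graph) {
  return term.termType === 'BlankNode'
    ? `_:${term.value}@${graph.value}`
    : term.value
}

const dataGraph = iri('data-graph')
const shapesGraph = iri('shapes-graph')

// Both inputs happen to use the label "b0" for unrelated nodes.
const dataQuads = [{
  subject: bnode('b0'),
  predicate: iri('http://xmlns.com/foaf/0.1/name'),
  object: literal('Alice')
}]
const shapesQuads = [{
  subject: bnode('b0'),
  predicate: iri('http://www.w3.org/ns/shacl#minCount'),
  object: literal('1')
}]

const joint = [
  ...intoGraph(dataQuads, dataGraph),
  ...intoGraph(shapesQuads, shapesGraph)
]
const subjects = new Set(joint.map(q => termKey(q.subject, q.graph)))
console.log(subjects.size) // 2
```

The two `b0` nodes stay distinct because the graph is part of the key, so no relabelling is needed.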

I tried to evaluate how easy this change would be and I think it will require a major refactoring: mostly passing clownface pointers around instead of nodes without context.