Use case: Ordered lists as an expected data format
danielbeeke opened this issue · 12 comments
We talked in the meeting on the 23rd of August about ordered lists:
It is painful to implement ordered lists at the moment, but they capture an important aspect of the data: the order. Inside my form renderer I have added a way to sort ordered lists, but it was quite painful.
sh:property [
  sh:name "Author reference"@en ;
  sh:path ( schema:author [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ;
]
It would be great if we could have something like this instead:
sh:property [
  sh:name "Author reference"@en ;
  sh:path schema:author ;
  OUR_NAMESPACE:isOrderedList true ;
]
In my form renderer I did a similar abstraction. When I detect an sh:path that specifies an ordered list, I replace it with a normal sh:path for rendering the form and pass the fact that it should be an ordered list to the dash:editor implementation. It would be nicer to do this in the SHACL itself.
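The detection step described above might look roughly like the following sketch. It is not the renderer's actual code: the in-memory encoding of a parsed sh:path (a tuple for a sequence path, a dict for sh:zeroOrMorePath) and the function name `split_list_path` are assumptions made only for illustration.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"


def split_list_path(path):
    """Return (plain_path, is_ordered_list) for an already parsed sh:path.

    Assumed encoding (hypothetical): a predicate path is a string, a
    sequence path is a tuple, and a zero-or-more path is a dict of the
    form {"zeroOrMorePath": inner_path}.
    """
    is_list_path = (
        isinstance(path, tuple)
        and len(path) == 3
        and path[1] == {"zeroOrMorePath": RDF_REST}
        and path[2] == RDF_FIRST
    )
    if is_list_path:
        # Keep only the leading predicate and report the list flag separately.
        return path[0], True
    return path, False


list_path = ("schema:author", {"zeroOrMorePath": RDF_REST}, RDF_FIRST)
plain, ordered = split_list_path(list_path)
```

The flag returned here plays the role that OUR_NAMESPACE:isOrderedList would play if it were stated directly in the shape.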
Questions:
- What would be a good predicate (alternative for 'isOrderedList')?
- How have you worked around ordered lists in SHACL? Do you have something to share about this topic?
None of the work-arounds are perfect but it remains an important problem. rdf:List is a terrible but necessary solution from the last century.
We have introduced a property dash:index that can be used together with reification/rdf-star. The benefit is that the values remain "normal" triples that can be accessed with set-based semantics, yet can also be queried in order if needed. Properties are marked with dash:indexed true, and constraints check that all indices from 0..N are present. Pain points include the cost of inserts, and look-ups that still require O(n).
@HolgerKnublauch I could not find documentation about dash:index, only this part in the ontology:
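As a rough illustration of the index check and the ordered read-out, here is a minimal sketch. It assumes the annotated statements have already been collected as (value, index) pairs; the function names are hypothetical and this is not how DASH implements the constraint.

```python
def missing_indices(indexed_values):
    """Return the positions in 0..len-1 that no value carries.

    An empty result means the indices are exactly 0..N with no gaps;
    a duplicate index shows up as a gap elsewhere.
    """
    used = {index for _, index in indexed_values}
    return [i for i in range(len(indexed_values)) if i not in used]


def in_order(indexed_values):
    """Return the values sorted by their index annotation."""
    return [value for value, _ in sorted(indexed_values, key=lambda pair: pair[1])]


# The values themselves form an unordered set; the order lives in the index.
children = [("ex:Child2", 1), ("ex:Child1", 0), ("ex:Child3", 2)]
gaps = missing_indices(children)
ordered = in_order(children)
```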
dash:IndexedConstraintComponent
  a sh:ConstraintComponent ;
  rdfs:comment "A constraint component that can be used to mark property shapes to be indexed, meaning that each of its value nodes must carry a dash:index from 0 to N." ;
  rdfs:label "Indexed constraint component" ;
  sh:parameter dash:IndexedConstraintComponent-indexed ;
.
dash:IndexedConstraintComponent-indexed
  a sh:Parameter ;
  sh:path dash:indexed ;
  dash:reifiableBy dash:ConstraintReificationShape ;
  sh:datatype xsd:boolean ;
  sh:description "True to activate indexing for this property." ;
  sh:maxCount 1 ;
  sh:name "indexed" ;
.
Could you elaborate what reification/rdf-star would do?
Assuming you have 3 children ordered by date of birth, I believe in the current RDF-star syntax draft it would look like
ex:Parent
  ex:child ex:Child1 {| dash:index 0 |} ;
  ex:child ex:Child2 {| dash:index 1 |} ;
  ex:child ex:Child3 {| dash:index 2 |} ;
.
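The insert cost mentioned earlier can be seen on this example: placing a new child between the first and second means every later dash:index annotation has to be rewritten. A small sketch, assuming the annotations are held as a child-to-index mapping (the helper name `insert_child` is hypothetical):

```python
def insert_child(children, position, new_child):
    """Insert new_child at position and shift every later index by one.

    Returns the updated mapping and the number of existing annotations
    that had to be rewritten, which grows with the list length.
    """
    updated = {}
    rewritten = 0
    for child, index in children.items():
        if index >= position:
            updated[child] = index + 1
            rewritten += 1
        else:
            updated[child] = index
    updated[new_child] = position
    return updated, rewritten


children = {"ex:Child1": 0, "ex:Child2": 1, "ex:Child3": 2}
updated, rewritten = insert_child(children, 1, "ex:ChildNew")
```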
In our model, we went for a slightly different approach. For example, when defining the sh:PropertyShape for sh:languageIn, we have done something like
<PropertyShape/languageIn>
  a sh:PropertyShape ;
  sh:path sh:languageIn ;
  hanami:listOf [
    sh:datatype xsd:string ;
  ] ;
.
And when going through validation, this sh:PropertyShape is expanded into something that loosely looks like this (inspired by the "shapes of shapes" from the SHACL specs)
<PropertyShape/languageIn>
  a sh:PropertyShape ;
  sh:path sh:languageIn ;
  sh:nodeKind sh:BlankNodeOrIRI ; # this is new
  sh:node <GeneratedUri/1> ; # this is new
  hanami:listOf [ # ignored by validation engine, we leave it here
    sh:datatype xsd:string ;
  ] ;
.
<GeneratedUri/1>
  a sh:NodeShape ;
  sh:property [
    sh:path [ sh:zeroOrMorePath rdf:rest ] ;
    sh:hasValue rdf:nil ;
    sh:node <GeneratedUri/2> ;
  ]
.
<GeneratedUri/2>
  a sh:NodeShape ;
  sh:or (
    [
      sh:hasValue rdf:nil ;
      sh:property [
        sh:maxCount 0 ;
        sh:path rdf:rest
      ] ;
      sh:property [
        sh:maxCount 0 ;
        sh:path rdf:first
      ]
    ]
    [
      sh:not [ sh:hasValue rdf:nil ] ;
      sh:property [
        sh:path rdf:first ;
        sh:maxCount 1 ;
        sh:minCount 1 ;
        sh:datatype xsd:string ; # the content of `hanami:listOf` is copied here
      ] ;
      sh:property [
        sh:path rdf:rest ;
        sh:maxCount 1 ;
        sh:minCount 1 ;
      ]
    ]
  )
.
So that it can be processed by vanilla SHACL validators.
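The expansion step could be sketched as below. This is not the hanami implementation, only an illustration of the rewrite shown above: shapes are modelled as plain dicts keyed by prefixed names, and `expand_list_of` and `make_iri` are hypothetical names.

```python
import itertools


def expand_list_of(shape, make_iri):
    """Expand hanami:listOf into shapes a vanilla SHACL validator can read.

    Returns [property shape, list shape, list-node shape]; a shape without
    hanami:listOf is returned unchanged as a one-element list.
    """
    member_constraints = shape.get("hanami:listOf")
    if member_constraints is None:
        return [shape]
    list_iri = make_iri()
    node_iri = make_iri()

    expanded = dict(shape)  # hanami:listOf is kept; validators ignore it
    expanded["sh:nodeKind"] = "sh:BlankNodeOrIRI"
    expanded["sh:node"] = list_iri

    list_shape = {
        "@id": list_iri,
        "a": "sh:NodeShape",
        "sh:property": [{
            "sh:path": {"sh:zeroOrMorePath": "rdf:rest"},
            "sh:hasValue": "rdf:nil",
            "sh:node": node_iri,
        }],
    }

    # The member constraints are copied onto the rdf:first property shape.
    first_shape = {"sh:path": "rdf:first", "sh:minCount": 1, "sh:maxCount": 1}
    first_shape.update(member_constraints)

    node_shape = {
        "@id": node_iri,
        "a": "sh:NodeShape",
        "sh:or": [
            {   # the terminating rdf:nil node has neither rdf:first nor rdf:rest
                "sh:hasValue": "rdf:nil",
                "sh:property": [
                    {"sh:path": "rdf:rest", "sh:maxCount": 0},
                    {"sh:path": "rdf:first", "sh:maxCount": 0},
                ],
            },
            {   # every other node has exactly one rdf:first and one rdf:rest
                "sh:not": {"sh:hasValue": "rdf:nil"},
                "sh:property": [
                    first_shape,
                    {"sh:path": "rdf:rest", "sh:minCount": 1, "sh:maxCount": 1},
                ],
            },
        ],
    }
    return [expanded, list_shape, node_shape]


counter = itertools.count(1)
shapes = expand_list_of(
    {"@id": "PropertyShape/languageIn", "a": "sh:PropertyShape",
     "sh:path": "sh:languageIn",
     "hanami:listOf": {"sh:datatype": "xsd:string"}},
    lambda: f"GeneratedUri/{next(counter)}",
)
```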
We also considered using something like OUR_NAMESPACE:isOrderedList true, but this would then be wrongly interpreted by vanilla SHACL validators. For example, if we had this in a data graph
<TitlePropertyShape>
  a sh:PropertyShape ;
  sh:path ex:title ;
  sh:datatype rdf:langString ;
  sh:languageIn ( "en" "fr" ) ;
.
and as sh:languageIn definition in the shapes graph
<PropertyShape/languageIn>
  a sh:PropertyShape ;
  sh:path sh:languageIn ;
  sh:datatype xsd:string ;
  OUR_NAMESPACE:isOrderedList true ;
.
And if we pass those to a validation engine, it will most likely say that <TitlePropertyShape> has some violations, because sh:languageIn is not pointing to xsd:string values but to an RDF list node.
Also this solution works nicely for even more complex scenarios (at least in our use cases), like so
<PropertyShape/or>
  a sh:PropertyShape ;
  sh:path sh:or ;
  hanami:listOf [
    sh:nodeKind sh:BlankNodeOrIRI ;
    sh:or (
      [
        sh:class sh:PropertyShape ;
      ]
      [
        sh:node <PropertyShape> ;
      ]
    ) ;
  ] ;
.
@WilliamChelman that is a nice way. The preprocessing is an elegant workaround.
What I see missing from this discussion is the potential need to cater for viewers and editors, as well as validation.
sh:path ( schema:author [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ;
works for viewing. It may be awkward, but it can already be supported using existing specs.
The proposed [ sh:path schema:author ; OUR_NAMESPACE:isOrderedList true ] has the drawback that an implementation which does not understand OUR_NAMESPACE:isOrderedList will inevitably render UI for creating a set of schema:author objects.
I think I like the hanami:listOf solution, where that property could be used directly by a UI builder and would otherwise be ignored by builders which do not understand it.
sh:property [
  sh:name "Author reference"@en ;
  hanami:listOf [ sh:path schema:author ] ;
]
The above would instruct a builder to render an editing UI which sets an RDF List to schema:author. The UI could be a draggable list of another component (dropdowns, etc.) or, in an optimized case, a text area where each line becomes a separate literal.
We could define the shape for lists in our namespace. Then we would have a fixed IRI that can be used to identify lists. UI components can just ignore the content of the shape. Validators don't need to change their logic. It just requires an owl:imports.
To identify that a property has rdf:Lists as values, we use sh:node dash:ListShape (see https://datashapes.org/dash.html#ListShape).
With that stable URI, widgets can more easily identify lists than relying on parsing rdf:rest etc.
I propose we copy the dash:ListShape to the new namespace. Please vote till the 18th of October.
dash:ListShape only says "this is an RDF list", but does not say "this is a list OF WHAT". How do you tell this is a list of xsd:string, as in the example provided by @WilliamChelman?
It should be easy to define a constraint component with a constraint such as dash:listMemberClass and dash:listMemberDatatype or dash:listMemberType.
Other constraints can be defined with the path ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) as shown in the example below (source: https://archive.topquadrant.com/constraints-on-rdflists-using-shacl/)
In the last call, I had the idea that we could also define an IRI as the root of the list with the fixed path. But I missed that this would lead to a named node as the object value of sh:path, which would be interpreted directly as a path.
ex:TrafficLightShape
  a sh:NodeShape ;
  sh:targetClass ex:TrafficLight ;
  sh:property [
    sh:path ex:colors ;
    sh:node dash:ListShape ;
    sh:property [
      sh:path ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ;
      sh:datatype xsd:string ;
      sh:minLength 1 ;
      sh:minCount 2 ;
      sh:maxCount 3 ;
    ]
  ] .
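To make the member path concrete, the sketch below walks ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) by hand over a set of triples and applies the count and length limits from the shape. It is an illustration only, not a SHACL engine: triples are plain tuples, xsd:string literals are modelled as Python strings, and the helper names are hypothetical. One caveat: a SHACL engine works on the set of distinct value nodes, so a list with repeated members may be counted differently there than in this sketch, which counts list positions.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def list_members(triples, head):
    """Follow rdf:rest from head and collect each rdf:first value in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    members, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed rdf:List at {node!r}")
        seen.add(node)
        members.append(first[node])
        node = rest[node]
    return members


def check_colors(members, min_count=2, max_count=3):
    """Return a list of problems; an empty list means the members conform."""
    problems = []
    if not min_count <= len(members) <= max_count:
        problems.append(f"expected {min_count}..{max_count} members, got {len(members)}")
    for position, member in enumerate(members):
        if not isinstance(member, str) or len(member) < 1:
            problems.append(f"member {position} is not a non-empty string")
    return problems


triples = {
    ("_:b1", RDF_FIRST, "red"), ("_:b1", RDF_REST, "_:b2"),
    ("_:b2", RDF_FIRST, "green"), ("_:b2", RDF_REST, RDF_NIL),
}
colors = list_members(triples, "_:b1")
problems = check_colors(colors)
```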