pkiraly/metadata-qa-api

Implementing SHACL constraints

SHACL defines a set of constraints intended for validating RDF statements (see https://www.w3.org/TR/shacl/#core-components-value-type). I think a subset of this vocabulary could be used in this framework as well: the constraints define general rules for data elements, independent of whether the data is expressed in RDF or in another format (JSON, CSV, XML, MARC21, etc.).

Here is the full list, organized by category:

Value type

  • class - Specifies that each value node is a SHACL instance of a given type. The values of sh:class in a shape are IRIs.
  • datatype - Specifies a condition on the datatype of each value node (e.g., xsd:integer). The values of sh:datatype in a shape are IRIs. A shape has at most one value for sh:datatype. List of datatypes: https://www.w3.org/TR/rdf11-concepts/#section-Datatypes.
  • nodeKind - Specifies a condition on the RDF node kind of each value node (IRI, blank node, literal, or a combination of these). A shape has at most one value for sh:nodeKind, and that value is one of the following six instances of the class sh:NodeKind:
    • sh:BlankNode
    • sh:IRI
    • sh:Literal
    • sh:BlankNodeOrIRI
    • sh:BlankNodeOrLiteral
    • sh:IRIOrLiteral

Cardinality

  • minCount - Specifies the minimum number of value nodes (the minimum cardinality). If the value is 0, the constraint is always satisfied and so may be omitted. Node shapes cannot have any value for sh:minCount. A property shape has at most one value for sh:minCount. The values of sh:minCount in a property shape are literals with datatype xsd:integer.
  • maxCount - Specifies the maximum number of value nodes (the maximum cardinality). Node shapes cannot have any value for sh:maxCount. A property shape has at most one value for sh:maxCount. The values of sh:maxCount in a property shape are literals with datatype xsd:integer.
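
Outside RDF, the cardinality constraints come down to counting how many values a field has in a record. A minimal sketch, assuming the values of one field have already been extracted into a list; the function name `check_cardinality` is hypothetical and not part of this framework:

```python
from typing import Optional, Sequence


def check_cardinality(values: Sequence[str],
                      min_count: Optional[int] = None,
                      max_count: Optional[int] = None) -> bool:
    """Return True if the number of values satisfies minCount and maxCount."""
    n = len(values)
    # An unset bound (None) is treated as "no constraint".
    if min_count is not None and n < min_count:
        return False
    if max_count is not None and n > max_count:
        return False
    return True
```

For example, `check_cardinality(titles, min_count=1, max_count=1)` would express "exactly one title".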

Value Range

  • minExclusive - The minimum exclusive value. The values of sh:minExclusive in a shape are literals. A shape has at most one value for sh:minExclusive.
  • minInclusive - The minimum inclusive value. The values of sh:minInclusive in a shape are literals. A shape has at most one value for sh:minInclusive.
  • maxExclusive - The maximum exclusive value. The values of sh:maxExclusive in a shape are literals. A shape has at most one value for sh:maxExclusive.
  • maxInclusive - The maximum inclusive value. The values of sh:maxInclusive in a shape are literals. A shape has at most one value for sh:maxInclusive.
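
The four value range constraints could be checked with a single comparison helper. This is only a sketch, assuming values are already parsed into comparable numbers; `check_range` and its parameter names are hypothetical:

```python
from typing import Optional


def check_range(value: float,
                min_inclusive: Optional[float] = None,
                min_exclusive: Optional[float] = None,
                max_inclusive: Optional[float] = None,
                max_exclusive: Optional[float] = None) -> bool:
    """Return True if value satisfies every bound that is set."""
    if min_inclusive is not None and value < min_inclusive:
        return False
    # Exclusive bounds also reject a value equal to the bound.
    if min_exclusive is not None and value <= min_exclusive:
        return False
    if max_inclusive is not None and value > max_inclusive:
        return False
    if max_exclusive is not None and value >= max_exclusive:
        return False
    return True
```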

String-based

  • minLength - Specifies the minimum string length of each value node. This can be applied to any literals and IRIs, but not to blank nodes. The values of sh:minLength in a shape are literals with datatype xsd:integer. A shape has at most one value for sh:minLength.
  • maxLength - Specifies the maximum string length of each value node. This can be applied to any literals and IRIs, but not to blank nodes. The values of sh:maxLength in a shape are literals with datatype xsd:integer. A shape has at most one value for sh:maxLength.
  • pattern - Specifies a regular expression that each value node must match. The values of sh:pattern in a shape are literals with datatype xsd:string, and they must be valid pattern arguments for the SPARQL REGEX function.
  • flags - An optional string of flags, interpreted as in SPARQL 1.1 REGEX. The values of sh:flags in a shape are literals with datatype xsd:string.
  • languageIn - Specifies that the allowed language tags for each value node are limited by a given list of language tags. A list of basic language ranges as per [BCP47]. Each value of sh:languageIn in a shape is a SHACL list. Each member of such a list is a literal with datatype xsd:string. A shape has at most one value for sh:languageIn.
  • uniqueLang - Specifies that no pair of value nodes may use the same language tag. Set to true to activate this constraint. The values of sh:uniqueLang in a shape are literals with datatype xsd:boolean. A property shape has at most one value for sh:uniqueLang. Node shapes cannot have any value for sh:uniqueLang.
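
The length and pattern constraints translate directly to plain strings. The sketch below assumes Python's `re` module as a stand-in for the SPARQL REGEX function (the two regex dialects are similar but not identical), and the flag mapping is my assumption. `check_string` is a hypothetical name:

```python
import re
from typing import Optional

# Assumed mapping from SPARQL REGEX flag letters to Python re flags.
# Other letters (e.g. "q") are not handled and raise KeyError.
_FLAGS = {"i": re.IGNORECASE, "s": re.DOTALL, "m": re.MULTILINE, "x": re.VERBOSE}


def check_string(value: str,
                 min_length: Optional[int] = None,
                 max_length: Optional[int] = None,
                 pattern: Optional[str] = None,
                 flags: str = "") -> bool:
    """Return True if value satisfies minLength, maxLength and pattern."""
    if min_length is not None and len(value) < min_length:
        return False
    if max_length is not None and len(value) > max_length:
        return False
    if pattern is not None:
        re_flags = 0
        for letter in flags:
            re_flags |= _FLAGS[letter]
        # Like REGEX, the pattern may match anywhere in the value
        # unless it is anchored with ^ and $.
        if re.search(pattern, value, re_flags) is None:
            return False
    return True
```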

Property Pair

  • equals - Specifies the condition that the set of all value nodes is equal to the set of objects of the triples that have the focus node as subject and the value of sh:equals as predicate. The value of sh:equals is the property to compare with. The values of sh:equals in a shape are IRIs.
  • disjoint - Specifies the condition that the set of value nodes is disjoint with the set of objects of the triples that have the focus node as subject and the value of sh:disjoint as predicate. The value of sh:disjoint is the property to compare the values with. The values of sh:disjoint in a shape are IRIs.
  • lessThan - Specifies the condition that each value node is smaller than all the objects of the triples that have the focus node as subject and the value of sh:lessThan as predicate. The value of sh:lessThan is the property to compare the values with. The values of sh:lessThan in a shape are IRIs. Node shapes cannot have any value for sh:lessThan.
  • lessThanOrEquals - Specifies the condition that each value node is smaller than or equal to all the objects of the triples that have the focus node as subject and the value of sh:lessThanOrEquals as predicate. The value of sh:lessThanOrEquals is the property to compare the values with. The values of sh:lessThanOrEquals in a shape are IRIs. Node shapes cannot have any value for sh:lessThanOrEquals.
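
In a non-RDF record, the "focus node" would be the record itself and the two properties would be two fields. A rough sketch, assuming a record is a dict that maps field names to lists of comparable values; all names here are hypothetical:

```python
from typing import Any, Dict, List

Record = Dict[str, List[Any]]


def equals(record: Record, field: str, other: str) -> bool:
    """Both fields hold the same set of values."""
    return set(record.get(field, [])) == set(record.get(other, []))


def disjoint(record: Record, field: str, other: str) -> bool:
    """The two fields share no value."""
    return set(record.get(field, [])).isdisjoint(record.get(other, []))


def less_than(record: Record, field: str, other: str) -> bool:
    """Every value of field is smaller than every value of other."""
    return all(a < b
               for a in record.get(field, [])
               for b in record.get(other, []))


def less_than_or_equals(record: Record, field: str, other: str) -> bool:
    """Every value of field is smaller than or equal to every value of other."""
    return all(a <= b
               for a in record.get(field, [])
               for b in record.get(other, []))
```

A typical use would be checking that a birth date precedes a death date, e.g. `less_than(record, "birth", "death")`.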

Logical

  • not - Specifies the condition that each value node cannot conform to a given shape. This is comparable to negation and the logical "not" operator.
  • and - Specifies the condition that each value node conforms to all provided shapes. This is comparable to conjunction and the logical "and" operator.
  • or - Specifies the condition that each value node conforms to at least one of the provided shapes. This is comparable to disjunction and the logical "or" operator.
  • xone - Specifies the condition that each value node conforms to exactly one of the provided shapes.
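
If each individual constraint is represented as a function from a value to a boolean, the logical constraints become simple combinators. This is a sketch under that assumption, with hypothetical names (the trailing underscores only avoid Python keywords):

```python
from typing import Callable

Check = Callable[[str], bool]


def not_(check: Check) -> Check:
    """The value must not pass the given check."""
    return lambda value: not check(value)


def and_(*checks: Check) -> Check:
    """The value must pass all checks."""
    return lambda value: all(check(value) for check in checks)


def or_(*checks: Check) -> Check:
    """The value must pass at least one check."""
    return lambda value: any(check(value) for check in checks)


def xone(*checks: Check) -> Check:
    """The value must pass exactly one check."""
    return lambda value: sum(1 for check in checks if check(value)) == 1
```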

Shape-based

  • node - Specifies the condition that each value node conforms to the given node shape. The node shape that all value nodes need to conform to. The values of sh:node in a shape must be well-formed node shapes.
  • property - Specifies that each value node has a given property shape.
  • qualifiedValueShape - The shape that the specified number of value nodes needs to conform to. The values of sh:qualifiedValueShape in a shape must be well-formed shapes. Node shapes cannot have any value for sh:qualifiedValueShape. This is a mandatory parameter of sh:QualifiedMinCountConstraintComponent and sh:QualifiedMaxCountConstraintComponent.
  • qualifiedValueShapesDisjoint - This is an optional parameter of sh:QualifiedMinCountConstraintComponent and sh:QualifiedMaxCountConstraintComponent. If set to true then (for the counting) the value nodes must not conform to any of the sibling shapes. The values of sh:qualifiedValueShapesDisjoint in a shape are literals with datatype xsd:boolean.
  • qualifiedMinCount - The minimum number of value nodes that conform to the shape. The values of sh:qualifiedMinCount in a shape are literals with datatype xsd:integer. This is a mandatory parameter of sh:QualifiedMinCountConstraintComponent.
  • qualifiedMaxCount - The maximum number of value nodes that can conform to the shape. The values of sh:qualifiedMaxCount in a shape are literals with datatype xsd:integer. This is a mandatory parameter of sh:QualifiedMaxCountConstraintComponent.

Other

  • closed - Specifies the condition that each value node has values only for those properties that have been explicitly enumerated via the property shapes specified for the shape via sh:property. Set to true to close the shape. The values of sh:closed in a shape are literals with datatype xsd:boolean.
  • ignoredProperties - Optional SHACL list of properties that are also permitted in addition to those explicitly enumerated via sh:property. The values of sh:ignoredProperties in a shape must be SHACL lists. Each member of such a list must be an IRI.
  • hasValue - Specifies the condition that at least one value node is equal to the given RDF term.
  • in - Specifies the condition that each value node is a member of a provided SHACL list.
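
Applied to plain records, hasValue and in are membership tests, and closed could be read as "the record has no fields other than the declared ones". A minimal sketch under those assumptions; the function names and the dict-based record are hypothetical:

```python
from typing import Any, Collection, Dict, Iterable, List


def has_value(values: Iterable[Any], expected: Any) -> bool:
    """At least one value equals the expected one (hasValue)."""
    return expected in list(values)


def all_in(values: Iterable[Any], allowed: Collection[Any]) -> bool:
    """Every value is a member of the allowed list (in)."""
    return all(value in allowed for value in values)


def is_closed(record: Dict[str, List[Any]],
              declared: Iterable[str],
              ignored: Iterable[str] = ()) -> bool:
    """The record only uses declared or explicitly ignored fields (closed)."""
    return set(record) <= set(declared) | set(ignored)
```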

The value type and shape-based constraints seem to be RDF-specific. The rest seem to be good candidates for implementation.