Native SHACL support

Question

tpluscode opened this issue 5 years ago · 22 comments

Describe the requirement

SHACL has been hinted at multiple times in various places. I would like to initiate a focused discussion leading to its adoption in Hydra.

A SHACL Shape can be an alternative resource description to hydra:Class.

Motivation

hydra:Class is inherently limited and it makes little sense to waste our limited resources on extending it. That would be wasted effort and just bloat.

Instead, SHACL is a well-recognised way to describe desired resources. Not only can it be used to drive the UI and API, but it is already being applied to validate payloads and is being integrated into triple stores and other tools.

Various potential usages are listed on the SHACL Use Cases and Requirements W3C note.

SHACL can easily do all that we are currently missing or only half-heartedly support:

cardinalities
closed sets of potential values
regular expressions
property paths
field ordering
(proposed) UI aid via DASH
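
To make the list above concrete, here is a minimal Python sketch of how a client might check a payload against three of these constraints (cardinality, a closed value set, and a regular expression). It is not a SHACL implementation: the shape is a plain dict that only borrows the term names sh:minCount, sh:maxCount, sh:in and sh:pattern as keys, and `check_property`, the two shapes and the sample values are invented for illustration.

```python
import re

def check_property(values, shape):
    """Return a list of violation messages for one property's values.

    `shape` is a plain dict keyed by SHACL term names; this is a toy
    approximation, not a conforming SHACL validator.
    """
    problems = []
    if len(values) < shape.get("sh:minCount", 0):
        problems.append("too few values")
    if "sh:maxCount" in shape and len(values) > shape["sh:maxCount"]:
        problems.append("too many values")
    for value in values:
        if "sh:in" in shape and value not in shape["sh:in"]:
            problems.append(f"{value!r} is not an allowed value")
        # sh:pattern is unanchored, so re.search (not re.fullmatch) is used
        if "sh:pattern" in shape and not re.search(shape["sh:pattern"], str(value)):
            problems.append(f"{value!r} does not match the pattern")
    return problems

# Hypothetical property shapes for an issue resource
status_shape = {"sh:minCount": 1, "sh:maxCount": 1, "sh:in": ["open", "closed"]}
key_shape = {"sh:pattern": "^[A-Z]+-[0-9]+$"}

status_ok = check_property(["open"], status_shape)
status_bad = check_property(["open", "pending"], status_shape)
key_bad = check_property(["hydra-1"], key_shape)
```

A real client would more likely hand the shape graph to an existing SHACL engine; the point here is only that each bullet maps to a small, declarative check.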

Proposed solutions

Only one simple change is required to open a whole world of opportunities, including what has been discussed in #199, namely to loosen the range semantics of expects:

{
  "@id": "hydra:expects",
-  "range": "hydra:Class",
+  "schema:rangeIncludes": [
+    "hydra:Class",
+    "sh:Shape"
+  ]
},

Alternative solutions

For a moment I thought to keep the range but that doesn't seem right 🤷

{
  "@id": "hydra:expects",
  "range": "hydra:Class",
+  "schema:rangeIncludes": [
+    "sh:Shape"
+  ]
},

The problem with hydra:Class may be that it is a subclass of rdfs:Class but a shape isn't. In fact, there may be multiple shapes which share an sh:targetClass. They serve a different purpose.

Thus, I don't see a different solution right now.

Answer 1 · 2020-05-22T08:21:11.000Z

Isn't schema:rangeIncludes a schema:Class?

Answer 2 · 2020-05-22T08:33:56.000Z

I think you mean that its range is a schema:Class. Yes, it is defined as

schema:rangeIncludes schema:rangeIncludes schema:Class .

Note that it still does not use rdfs:range. The difference is that rdfs:range has strong semantics. Broadly speaking, using the hydra:expects example, it means that

Every object of hydra:expects is implicitly a hydra:Class and thus an rdfs:Class

IMO this is not desired. With schema:rangeIncludes you don't get this effect. It is only a hint that the property objects may be one of the listed types. Or something completely different.
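
The difference can be sketched in a few lines of Python. The snippet below applies only the RDFS range rule (if `p rdfs:range C` and `s p o`, then `o rdf:type C`) to triples held as plain tuples. The `ex:` resources and the `infer_types` helper are made up for illustration, and the further step from hydra:Class to rdfs:Class via subclassing is left out.

```python
RDF_TYPE = "rdf:type"

def infer_types(data, schema):
    """Apply the RDFS range rule: (p rdfs:range C) and (s p o) entail (o rdf:type C)."""
    ranges = [(s, o) for s, p, o in schema if p == "rdfs:range"]
    inferred = set()
    for s, p, o in data:
        for prop, cls in ranges:
            if p == prop:
                inferred.add((o, RDF_TYPE, cls))
    return inferred

# Hypothetical operation that expects a SHACL shape
data = [("ex:createIssue", "hydra:expects", "ex:IssueShape")]

strict = [("hydra:expects", "rdfs:range", "hydra:Class")]
loose = [
    ("hydra:expects", "schema:rangeIncludes", "hydra:Class"),
    ("hydra:expects", "schema:rangeIncludes", "sh:Shape"),
]

with_range = infer_types(data, strict)           # the shape is typed as hydra:Class
with_range_includes = infer_types(data, loose)   # nothing is entailed
```

With rdfs:range the shape silently becomes a hydra:Class; with schema:rangeIncludes a reasoner has nothing to act on, which is the "only a hint" behaviour described above.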

Answer 3 · 2020-05-23T18:44:27.000Z

I like the idea of having SHACL on board - I was among those suggesting it in recent years.

We just need to move carefully - removing the rdfs:range is a breaking change and we need to make it as non-breaking as possible.

I also find it a bit surprising to modify hydra:expects - I thought we'd already made a modification allowing a hydra:Resource so everything can be expected (including raw resources).

I'd look rather at hydra:property or hydra:supportedClass so these can accept SHACL constructs.

Answer 4 · 2020-05-23T19:54:32.000Z

Shoot, you're right. We already have changed the range and in fact also use `rangeIncludes` 👌 I was again misled by the discrepancy between the JSON-LD and HTML 🤦‍♂️. We should work on having them synchronised.

That said, I would still propose to add `hydra:expects schema:rangeIncludes sh:Shape`. This way we would explicitly define the types which a generic client should handle. Otherwise, like you said, `hydra:Resource` is a totally open-ended fallback.

I'm curious about your other ideas. While I don't think `supportedClass` fits SHACL, I would like to know what you had in mind for `hydra:property`

Answer 5 · 2020-05-25T07:04:24.000Z

Oh, what about IriTemplate mappings. Maybe they could also be described with a Shape?

The entire mapping could easily be swapped with a Shape, provided that individual PropertyShapes have an additional variable name, which we use currently.

Here's an idea for changing EXAMPLE 19 from the spec

{
  "@context": "http://www.w3.org/ns/hydra/context.jsonld",
  "@type": "IriTemplate",
  "template": "http://api.example.com/issues{?q}",
  "variableRepresentation": "BasicRepresentation",
- "mapping": [
-   {
-     "@type": "IriTemplateMapping",
-     "variable": "q",
-     "property": "hydra:freetextQuery",
-     "required": true
-   }
- ]
+ "mapping": {
+   "@type": "sh:NodeShape",
+   "sh:property": [
+     {
+       "variable": "q",
+       "sh:path": "hydra:freetextQuery",
+       "sh:minCount": 1
+     }
+   ]
+ }
}

Little has to change conceptually.

hydra:IriTemplateMapping -> sh:PropertyShape
hydra:variable used with sh:PropertyShape
hydra:property -> sh:path
hydra:required -> sh:minCount
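
The correspondence listed above is mechanical enough to write down as a conversion. The Python sketch below works on JSON-LD-like dicts using the compacted term names from EXAMPLE 19; the function names are hypothetical, and keeping `variable` on the property shape is exactly the extension proposed here, not something SHACL defines.

```python
def mapping_to_property_shape(mapping):
    """Translate one hydra:IriTemplateMapping (as a dict) into a property-shape dict."""
    shape = {
        "@type": "sh:PropertyShape",
        "variable": mapping["variable"],   # hydra:variable is kept as-is
        "sh:path": mapping["property"],    # hydra:property -> sh:path
    }
    if mapping.get("required", False):
        shape["sh:minCount"] = 1           # hydra:required -> sh:minCount
    return shape

def mappings_to_node_shape(mappings):
    """Wrap the converted mappings in a single-level node shape."""
    return {
        "@type": "sh:NodeShape",
        "sh:property": [mapping_to_property_shape(m) for m in mappings],
    }

example_19_mapping = [
    {"@type": "IriTemplateMapping", "variable": "q",
     "property": "hydra:freetextQuery", "required": True}
]
node_shape = mappings_to_node_shape(example_19_mapping)
```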

Of course, some limitations could be imposed for practical reasons:

only single level of shapes
no property paths

Maybe I should make a separate issue to discuss this?

Answer 6 · 2020-05-25T07:52:13.000Z

I was wondering whether there is any need to denote that a mapping can be a SHACL property - in RDF everything used as a predicate is considered an rdf:Property (correct me if I'm wrong), thus I see no point in explicitly saying that a sh:NodeShape and its sh:property is used.
It won't matter whether the used IRI is an owl:FunctionalProperty, sh:property or raw rdf:Property.

I still see some benefits from using SHACL for cardinalities, but I think it is a way beyond this issue.

Answer 7 · 2020-05-25T08:32:04.000Z

[...] thus I see no point in explicitly saying that a sh:NodeShape and its sh:property is used.

You propose to simply replace IriTemplateMapping with sh:PropertyShape?

It won't matter whether the used IRI is an owl:FunctionalProperty, sh:property or raw rdf:Property.

I'm confused. The owl and rdf terms are classes. sh:property is a property. I think you're misreading the snippet

I still see some benefits from using SHACL for cardinalities, but I think it is a way beyond this issue.

And the cardinality is only because SHACL does not have a required constraint. We might keep hydra:required. Why not?

Answer 8 · 2020-05-25T08:54:08.000Z

No, I'd leave it as it is now. I wouldn't introduce new constructs just for SHACL. I think IriTemplateMapping is good enough to cover any scenario.
Consider this:

my:Class a rdfs:Class .
my:predicate a rdf:Property;
    rdfs:domain my:Class .

my:Class a owl:Class .
my:predicate a owl:FunctionalProperty;
    rdfs:domain my:Class .

my:Class a sh:NodeShape;
    sh:property [
        sh:path my:predicate
    ] .

All of the above snippets define the same concept (more or less) of a class my:Class with predicate my:predicate. From the mapping point of view, you'll have an RDF graph of...:

some:resource my:predicate "value" .

...in which case it won't matter how my:predicate is defined. You have to grab its value and put it in the template. Simple as that.
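
That "grab the value and put it in the template" step could look roughly like the Python sketch below. It handles only form-style query expressions such as `{?q}` or `{?a,b}`, a small subset of RFC 6570, and treats the graph as a list of tuples. The `expand` function and the sample data are illustrative assumptions; note that nothing in it looks at how the predicate is declared.

```python
import re
from urllib.parse import quote

def expand(template, mappings, triples, subject):
    """Expand `{?a,b}` expressions only; a small subset of RFC 6570."""
    # variable name -> predicate IRI, taken from the template mappings
    predicate_for = {m["variable"]: m["property"] for m in mappings}

    def value_of(variable):
        predicate = predicate_for.get(variable)
        for s, p, o in triples:
            if s == subject and p == predicate:
                return o
        return None

    def replace(match):
        pairs = []
        for name in match.group(1).split(","):
            value = value_of(name)
            if value is not None:
                pairs.append(f"{name}={quote(str(value), safe='')}")
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template)

mappings = [{"variable": "q", "property": "hydra:freetextQuery"}]
triples = [("some:resource", "hydra:freetextQuery", "native shacl")]
url = expand("http://api.example.com/issues{?q}", mappings, triples, "some:resource")
empty = expand("http://api.example.com/issues{?q}", mappings, [], "some:resource")
```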

Answer 9 · 2020-05-25T10:06:08.000Z

I am completely lost. For one, I don't understand why you bring in the rdfs:Class and owl:Class. They are not even used with IRI templates.

By using SHACL for both hydra:expects as well as hydra:mapping we gain uniformity. Same shared W3C standard to define the request graphs as well as the graph mapped to the template variables.

I think IriTemplateMapping is good enough to cover any scenario.

Surely, SHACL has features which we cannot currently match. Such as cardinalities, sh:order, sh:group and much more.

I would hardly want to reinvent that in hydra: namespace. Much less use OWL

Answer 10 · 2020-05-25T11:15:05.000Z

I am completely lost. For one, I don't understand why you bring in the rdfs:Class and owl:Class.

All of these, SHACL, RDFS, OWL (and maybe others not mentioned here), allow you to describe data structures that an API can use (among other things that you can do with those vocabularies).

They are not even used with IRI templates.

These were just examples to give a broader context of property definition. In SHACL you define a property in a shape's context, in RDFS/OWL you don't need a class, but I've added it to show the analogy.

Such as cardinalities, sh:order, sh:group and much more.

I see no way how sh:order and sh:group could be used in scope of an IRI template. Does it have anything to do with variable representation?
The only usable feature would be the cardinality mentioned, but there is the required property, which fits most of the cases.

Answer 11 · 2020-05-25T11:29:23.000Z

I see no way how sh:order and sh:group could be used in scope of an IRI template. Does it have anything to do with variable representation?

No, but you may want to create a UI form to take user input.

The only usable feature would be the cardinality mentioned

There is definitely more to SHACL. For example sh:in to define choice sets, a basis for extensible constraints, and other proposed extensions from DASH.
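
As a rough illustration of the UI angle, the Python sketch below turns property shapes (again plain dicts borrowing SHACL term names) into an ordered list of form fields: sh:order decides the position, sh:in turns a text input into a select, and sh:minCount marks a field as required. The `form_fields` helper, the field layout and the `ex:status` property are assumptions made for this example.

```python
def form_fields(property_shapes):
    """Order property shapes by sh:order and describe a form widget for each."""
    fields = []
    for shape in sorted(property_shapes, key=lambda s: s.get("sh:order", 0)):
        fields.append({
            "name": shape["variable"],
            "widget": "select" if "sh:in" in shape else "text",
            "choices": shape.get("sh:in", []),
            "required": shape.get("sh:minCount", 0) > 0,
        })
    return fields

# Hypothetical shapes for a search form with two template variables
shapes = [
    {"variable": "status", "sh:path": "ex:status", "sh:order": 2,
     "sh:in": ["open", "closed"]},
    {"variable": "q", "sh:path": "hydra:freetextQuery", "sh:order": 1,
     "sh:minCount": 1},
]
fields = form_fields(shapes)
```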

Answer 12 · 2020-05-25T11:31:13.000Z

I'm curious about your other ideas.
While I don't think supportedClass fits SHACL,
I would like to know what you had in mind for hydra:property

I was wondering how to support the various ways in which supported data structures could be hinted to the client. Currently hydra:supportedClass accepts hydra:Class, which is an rdfs:Class, which may raise SHACL shapes to that level. While I personally see no harm in a SHACL shape being a hydra:Class simultaneously, this may be unwanted behavior in some cases (what cases?).

I was also thinking about how the client would tell a server its preferences regarding data structure description (I've mentioned various ways of describing those in my previous post). Maybe a Prefer header? I'd love to see hydra as a common framework driving APIs, but I'd hate to see several hydra dialects applied here and there so that no generic client can talk to all of them.

Answer 13 · 2020-05-25T11:42:15.000Z

Class and Shape are orthogonal. A shape can, via sh:targetClass, refer to an rdfs:Class it describes, but there is no 1:1 relation. You will likely have multiple shapes which define a single graph.
Slightly different semantics. With supportedClass it says "those are the Classes the consumer is likely to find in an API".

I was also thinking about how the client would tell a server its preferences regarding data structure description

If we agree to add Shape to the range of expects, I would have a 100% compliant client understand both.

In the long run I could even see hydra:Class deprecated (discouraged) as it unnecessarily overlaps the features of SHACL. I think it's a waste of time to maintain it where SHACL can be used. We would only need small extensions to SHACL where it is not enough (hydra:required?)

Answer 14 · 2020-05-25T12:16:38.000Z

I don't know SHACL that much - all I can see is that targets are used to create "node focus" for validation (i.e. which parts of the graph should be validated against a shape).

I was also thinking about how the client would tell a server
its preferences regarding data structure description

If we agree to add Shape to the range of expects, I would have a 100%
compliant client understand both.

I don't see how those two are related to each other. I was thinking about how a client could ask a server to describe its data structures with a desired vocabulary. Expectations are the opposite - the client describes something to the server.

Answer 15 · 2020-05-25T12:28:33.000Z

I don't see how those two are related to each other. I was thinking about how a client could ask a server to describe its data structures with a desired vocabulary.

Quite interesting, but not what I had in mind indeed. I am only considering the API Documentation and not the resources themselves.

What I meant is that if we have

hydra:expects schema:rangeIncludes hydra:Class, sh:Shape

then a compliant client will have to understand both ways in which an API might describe its operations. And anything else we add to this schema:rangeIncludes would have to be supported out of the box.
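
In client code, "understanding both" might come down to a dispatch on the type of the hydra:expects object, as in this Python sketch. The strategy names and the `pick_strategy` function are hypothetical. Since no inference is assumed, the SHACL subclasses sh:NodeShape and sh:PropertyShape are listed explicitly next to sh:Shape.

```python
SHACL_TYPES = {"sh:Shape", "sh:NodeShape", "sh:PropertyShape"}

def pick_strategy(expects):
    """Decide how a client would read the body description of an operation."""
    types = expects.get("@type", [])
    if isinstance(types, str):
        types = [types]          # JSON-LD allows a single string or a list
    if SHACL_TYPES & set(types):
        return "shacl"
    if "hydra:Class" in types:
        return "hydra-class"
    return "unknown"             # e.g. a bare hydra:Resource

by_shape = pick_strategy({"@id": "ex:IssueShape", "@type": "sh:NodeShape"})
by_class = pick_strategy({"@id": "ex:Issue", "@type": ["hydra:Class"]})
fallback = pick_strategy({"@id": "ex:Anything"})
```

Every type added to schema:rangeIncludes would add one more branch that a generic client is expected to implement.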

Answer 16 · 2020-05-25T14:15:02.000Z

I've had issues with expect support for things that cannot be dereferenced (say, any schema.org class). I've consoled myself into thinking I'll just ignore that part, and break people's expectation a bit (well, yeah sure try and dereference it, I won't give you an @id), and instead use any rdfs:Resource I wish, cause there's no other way to use expect in nearly any of the scenarios I encounter. returns is in the same boat.

I have had the same issue, which I've described at length before, where I do wish to document those things alongside the ApiDocumentation documents, so that a browser could do useful displaying of those. The same issue with supportedClass / hydra:Class prevents me from doing that there, so after consulting with some of you, I've also gone back to adding my own rdfs:Resource there, as an ApiDocumentation extension, which is completely out of specification and will only work for us, and as it's not in spec it's probably never gonna be interoperable. The specification itself never says that a browser or a client should know about rdfs at all, which makes it impossible for interop to exist, as no one knows what it would look like.

All this to say two things: I don't really like breaking changes if I can avoid them, as like most devs I work on spec numbers, and continuous specs are a nightmare, and I'd like to have a plan to a 1.0 before I fork my work so I can actually support my API commercially over time.

Secondly, and here I quote the specification:

Hydra is a lightweight vocabulary to create hypermedia-driven Web APIs.

Hydra is about adding hypermedia. It looks to me like it's never been tied to any schema system explicitly so far, and I think it would benefit from not doing so.

Since Hydra uses classes to describe the information expected or returned by an operation, it also defines a concept to describe the properties known to be supported by a class

I think that's no longer correct anyway, but even so, the premise itself I believe is flawed.

There's RDFS, OWL, SHACL, but also JSON Schema in RDF (https://www.w3.org/2019/wot/json-schema) (and JSON Schema has rather wide industry support out there), and if I follow the spec today, I'm very much lost: except for properties to hydra:Resources being needed or not (round hole square peg problem, as defined above), schemas are not talked about and left as an exercise in creativity, also as described above. That renders the majority of my APIs not compliant as it exists.

If we move everything to SHACL, then that's a breaking change, and the spec is still kinda neither here nor there. The current spec doesn't work for me, due to not providing the simple features to document schemas for things that are not dereferenceable.

I would suggest relaxing payload descriptions and supported classes to non-hydra types that could then be described independently and more effectively by a schema language of choice that would make sense to the industry in which hydra is used. Those could be additional lightweight specifications: "Json Schema in Hydra APIs", "RDFS and SHACL in Hydra APIs" etc.

Alternatively, one specification that covers describing all types that an API may use, with the minimum required set if people feel that's useful, and then a controlled way to expand those hydra-defined things with additional schema (a binding to JSON Schema, additions of SHACL shapes, etc). Let me read, and implement Hydra alone, in a client and in a server, without having to bring in the whole RDF world with me.

I'll add one final note: reusable generic hypermedia clients have not received much love on most platforms, and making it harder to create one is not gonna help anything.

Answer 17 · 2020-05-25T14:55:46.000Z

Although I agree with @tpluscode in that SHACL solves the problems Hydra tries to solve better, I also have to agree with @serialseb here. We are specifying the Core vocabulary of Hydra, which I think implies being as concise and free of dependencies as possible.

Instead of baking in any defaults for schema and shape validation, it would be great if Hydra Core allowed extension points that could be implemented by SHACL, OWL, JSON Schema, Relax NG, XSD or any other schema mechanism.

I'm not even sure Hydra should provide its own MVP schema language out of the box, or simply define that as out of scope for the Core specification and delegate it to one of possibly many extension specifications.

An extension specification that bridges SHACL and Hydra would get my vote, though. And I do think such a spec could be embraced as an official recommendation for a "Hydra Standard Profile" (a spec that defines which extensions to prefer for each of Hydra's extension points), but not as a requirement for Hydra Core.

Thoughts?

Answer 18 · 2020-05-25T15:04:54.000Z

I'm not even sure Hydra should provide its own MVP schema language out of the box, or simply define that as out of scope for the Core specification and delegate it to one of possibly many extension specifications.

I was just thinking that indeed right now. I did hint above that anything we reference in the Core spec should be implemented by a generic client. But we also cannot have a client implement all possible schema options.

I'm not even sure Hydra should provide its own MVP schema language out of the box

Right... It hardly makes sense to maintain a built-in model (hydra:Class et al.) if we know that it is a wasted effort on a sub-par solution which is likely not enough for any serious application.

it would be great if Hydra Core allowed extension points that could be implemented by SHACL, OWL, JSON Schema, Relax NG, XSD or any other schema mechanism.

The vocabulary is already open enough to allow anything to be used with hydra:expects. Might continue the discussion of template mappings (which I consider forms-lite) to also be not limited to hydra terms and we'd be almost at the finish line.

Answer 19 · 2020-05-25T16:20:27.000Z

Right... It hardly makes sense to maintain a built-in model (hydra:Class et al.) if we know that it is a wasted effort on a sub-par solution which is likely not enough for any serious application.

Exactly. What do you say, @serialseb? Is the value provided by hydra:Class enough to warrant its existence in the Core vocabulary, or should it be removed altogether, delegating schema and validation to external tooling and standards?

The vocabulary is already open enough to allow anything to be used with hydra:expects. Might continue the discussion of template mappings (which I consider forms-lite) to also be not limited to hydra terms and we'd be almost at the finish line.

It would be great to see an attempt at specifying the bridge between Hydra and at least two different schema languages, just so we can get a clear picture of what that interaction is going to look like. If there are any wrinkles, we should iron those out before Hydra Core reaches v1.0.

Answer 20 · 2020-05-25T19:27:13.000Z

I've had issues with expect support for things that cannot be dereferenced

This should be fixed now as both hydra:expects and hydra:returns have hydra:Resource in their range (a class of dereferenceable resources, which I think should be true in most cases).

say, any schema.org class

Terms from schema.org are quite dereferenceable - the HTML returned contains some markup that carries RDF details (I don't remember whether it was Microdata or RDFa). While not that common and difficult to process by clients, it's RDF.

I have had the same issue, which I've described at length before, where I do wish to document those things alongside the ApiDocumentation documents, so that a browser could do useful displaying of those.

I think that is not the purpose of hydra. You should use any of the mentioned schemas here to describe your data. Hydra should enable you to make some of those displayed things links or buttons.

The specification itself never says that a browser or a client should know about rdfs at all

Good point. I just think that when working with RDF, RDFS is a must. Basic RDF concepts are formalized in RDFS, thus simple use of rdf:type forces a client to understand RDFS.

All this to say two things: I don't really like breaking changes if I can avoid them

I agree. That's why I tend to hold some horses here :)

It looks to me like it's never been tied to any schema system explicitly so far, and I think it would benefit from not doing so.

That's right - hydra never coupled itself with any specific external vocabulary. I prefer to enable hydra for extension rather than to bind it to some other vocab.

If we move everything to SHACL, then that's a breaking change

I don't think that's going to happen.

I would suggest relaxing payload descriptions and supported classes

That's why I've pointed to it at the very beginning of the discussion: I'd look rather at hydra:property or hydra:supportedClass so these can accept SHACL constructs.

I'm not even sure Hydra should provide its own MVP schema language out of the box

Let me help you - it shouldn't. I don't think hydra should come with either a data structure definition language or a query language.

An extension specification that bridges SHACL and Hydra would get my vote, though. And I do think such a spec could be embraced as an official recommendation for a "Hydra Standard Profile".

My initial thought on using SHACL was a soft recommendation within the spec but some RDF constructs opening hydra for extensions are welcome.

Answer 21 · 2020-05-25T19:40:47.000Z

To wrap things up, we've currently got 3 areas where SHACL (and possibly other vocabs) could be used alongside hydra:

data structure definition - this should live side-by-side with hydra; hydra comes with supportedClass and supportedProperty which should point to classes and properties described by SHACL, OWL, RDFS, you name it; I believe a slight modification of supportedClass's range should enable those vocabs
expects/returns
IRI template mapping - I personally see no point in using something else here

Answer 22 · 2021-06-17T09:21:51.000Z

I created PR HydraCG/extensions#8 to start writing down SHACL bits.

Please have a look.