HydraCG/Specifications

How to document forbidden dereferencability

alien-mcl opened this issue · 32 comments

Documenting forbidden de-referencability

In some cases there is a need to disallow a client from de-referencing a resource. Currently there is no way of doing that, and hydra:Resource implies a resource is (at least virtually) de-referencable (a client can, e.g., HTTP GET it).

When a resource has a callable IRI (one based on the http[s] scheme), the server may know that it won't support a GET operation on this URI, so it may need to document that fact.

Proposed solutions

I currently see 3 viable solutions.

  1. Downgrade rdfs:range of hydra:expects and hydra:returns (and possibly hydra:supportedClass?) to rdfs:Resource instead of hydra:Resource. According to the HTTP specification, a client cannot assume that HTTP/HTTPS based URIs can be called with the GET method, so the API documentation won't encourage the client to do so, as opposed to the currently used hydra:Resource.
    This may be a breaking change, as it may affect existing Hydra-driven implementations, especially those using an RDF reasoning process. It makes it simple to introduce external schema vocabularies like SHACL, OWL and non-RDF based ones.

  2. Introduce a new term that explicitly forbids a client from de-referencing a resource. This won't break anything that uses the current Hydra version and is somewhat consistent with RDF (you need to explicitly say something) and the linked data paradigm. It may also open the gates for future features like disallowing operations or operation precedence/hiding.

  3. Introduce Hydra vocabulary profiles or levels, i.e. level 0 and 1, where level 1 would be the default, current Hydra version and level 0 would be an alternative that does not imply a hydra:Resource is de-referencable (or even RDF based). This could also open Hydra up to future profiles/levels using external schema vocabularies like SHACL, OWL and non-RDF based ones.
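
To make the reasoning concern in option 1 concrete, here is a minimal sketch of RDFS range entailment (rule rdfs3) in plain Python. The triples, the `ex:` names and the `infer_types` helper are illustrative assumptions and not part of Hydra. It only shows that an inferred hydra:Resource type disappears once a range is relaxed to rdfs:Resource.

```python
# Minimal sketch of RDFS range entailment (rule rdfs3):
#   (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
# All ex: names below are hypothetical and only serve the illustration.

def infer_types(schema, data):
    """Return the set of (node, class) pairs entailed by rdfs:range statements."""
    ranges = {}
    for s, p, o in schema:
        if p == "rdfs:range":
            ranges.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in data:
        for cls in ranges.get(p, ()):
            inferred.add((o, cls))
    return inferred


data = [("ex:createPerson", "hydra:expects", "ex:Person")]

# Range as described in the issue: objects of hydra:expects become hydra:Resource.
current = infer_types([("hydra:expects", "rdfs:range", "hydra:Resource")], data)

# Option 1: the range is relaxed, so only the trivial rdfs:Resource type follows.
relaxed = infer_types([("hydra:expects", "rdfs:range", "rdfs:Resource")], data)

print(("ex:Person", "hydra:Resource") in current)  # True
print(("ex:Person", "hydra:Resource") in relaxed)  # False
```

A client that branches on the inferred hydra:Resource type would therefore behave differently after the change, which is the kind of breakage the option warns about.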

I would definitely have 1️⃣ over 2️⃣


The third point, however, I find rather orthogonal. I'm not sure the "profiles" have anything to do with dereferencability of the API descriptions. They are more about the requirements placed on the client. I would see this as an open-ended set of "feature-packs" that a server may implement in addition to core. Here are some ideas:

  1. Hydra Core

    • Only the base terms to put an API together
  2. Standard Profile

    • hydra:Class
  3. SHACL profile

    • Shapes for expects/returns
    • maybe also supportedClass
  4. JSON Schema profile

    • like SHACL, uses a different vehicle to describe requests and responses
  5. IANA Media types profile

    • Uses IANA-registered media types. For example, the snippet below might denote an operation where PNG and JPG are allowed to be uploaded
    <> hydra:expects 
        <https://www.iana.org/assignments/media-types/image/png> ,
        <https://www.iana.org/assignments/media-types/image/jpeg> .
    
  6. Multi-part profile

    • Something we discussed in #199 where an operation is described as expecting a multipart/form-data body.
    • Individual parts could then be individually described using the other profiles. For example to have one part an image (IANA profile) and the other a plain RDF resource (Standard profile)

Each profile would be free to define its specific processing rules.

The required changes would be to:

  1. Add an optional property on ApiDocumentation to assert which profiles a server uses
  2. Relax some rdfs:range statements of extension points
  3. Move the "Standard Profile" to a separate spec document. Would probably keep

So in effect a server might announce itself as

<> a hydra:ApiDocumentation ;
  # SHACL might potentially be maintained by Hydra CG
  hydra:profile <http://www.w3.org/ns/hydra/profile#SHACL> ;
  # a vendor-specific profile which somehow bridges Open API
  hydra:profile <http://example.com/hydra-profile#OpenAPI> .

I'm not sure the "profiles" have anything to do with dereferencability of the API descriptions.

Not everything can be expressed with raw RDF term definitions - that's why the spec exists. Either profiles or levels could allow parts of the spec to be interpreted differently. The default level could leave the hydra:Resource interpretation as it is now (resources can be called with GET), while at other levels this assertion could be relaxed or removed.

Add an optional property on ApiDocumentation to assert which profiles a server uses

It might be not as easy as it sounds. There can be multiple API documentation links, what if every part provides different value? We also need something for in-lined hypermedia controls and on HTTP header level for pure JSON with JSON-LD context provided on that level. We might need to define precedence or scope of such a profile assertion.

Move the "Standard Profile" to a separate spec document. Would probably keep

I don't like it - I'd prefer to keep it as it is now.

It might be not as easy as it sounds. There can be multiple API documentation links, what if every part provides different value?

I think that for the most part it actually is simple. If you consume multiple APIs (for example via multiple link headers), the client will have to understand all profiles used by all those APIs. No way around it.

Good point about HTTP headers, but IMO same principle applies. The client will have to support the sum of all profiles.
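
The "sum of all profiles" rule can be sketched as a simple set computation. This is only an illustration: hydra:profile is a proposal from this thread, and the `api_docs` structure, the `SUPPORTED` set and both helper functions are hypothetical.

```python
# Sketch: a client consuming several API documentations must understand the
# union of all profiles they declare. The shape of `api_docs` is an assumption;
# hydra:profile itself is only a proposal in this discussion.

SUPPORTED = {
    "http://www.w3.org/ns/hydra/profile#SHACL",
}


def required_profiles(api_docs):
    """Union of the profiles declared by every API documentation in use."""
    required = set()
    for doc in api_docs:
        required |= set(doc.get("profiles", ()))
    return required


def missing_profiles(api_docs, supported=SUPPORTED):
    """Profiles the client would need but does not implement."""
    return required_profiles(api_docs) - set(supported)


api_docs = [
    {
        "id": "http://example.com/api-a/doc",
        "profiles": ["http://www.w3.org/ns/hydra/profile#SHACL"],
    },
    {
        "id": "http://example.com/api-b/doc",
        "profiles": ["http://example.com/hydra-profile#OpenAPI"],
    },
]

print(sorted(missing_profiles(api_docs)))
# ['http://example.com/hydra-profile#OpenAPI']
```

An empty result would mean the client can interpret every description it was given. A non-empty one names the profiles it would have to load or live without.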

We also need something for in-lined hypermedia controls

Inlined controls will use the same descriptions, no? If API uses profiles A and B then you might expect those profiles applied to inline hypermedia. Am I missing something?

Move the "Standard Profile" to a separate spec document. Would probably keep

I don't like it - I'd prefer to keep it as it is now.

No strong opinion on my part. I can definitely live with that 😉

I think you are coming from the perspective of negotiating profiles. Do I get that right?

I would not consider this. An API chooses to use a certain method of describing its hypermedia and it is up to the client to support it. Not that a client can ask for the descriptions it "likes".

Negotiating the hypermedia controls is totally impossible because profiles will likely be incompatible with one another. So either a client supports your profile or it will be unable to (fully) consume your API. I don't think there is a way around this.

I agree with @tpluscode, Hydra should not worry about negotiating profiles. If that should at all be possible, there are other initiatives tackling that problem.

Inlined controls will use the same descriptions, no? If API uses profiles A and B then you might expect those profiles applied to inline hypermedia. Am I missing something?

Imagine a situation when API documentation declares a profile A, but inlined controls are using profile B. Which one is in force?

I would not consider this. An API chooses to use a certain method of describing its hypermedia and it is up to the client to support it.

I think it's the opposite. Hydra claims to provide interoperability with minimum terms and maximum extensibility. I can imagine a situation where a server can provide the same information using different vocabularies (or in non-RDF ways). I think it may be valuable to enable the client to express its preferences (which then can be taken into account or not - it's up to the server).

Hydra should not worry about negotiating profiles.

I don't think we should create something to support it - the link you @asbjornu provided is interesting and may be a good starting point.

I think I can see both terms level and profile used simultaneously - level for spec interpretation, profile for extensibility.

Imagine a situation when API documentation declares a profile A, but inlined controls are using profile B. Which one is in force?

You miss the point. If the API states that it uses profile A a client will only expect such controls. Note that this will likely be discovered by humans and not machines.

Now, if such an API in fact uses another profile then a client will likely fail to understand them or it might not care if it supports profile B nonetheless.

Again, it's not about precedence. The mentioned API uses profiles A (explicitly) and B (implicitly). The client always has to support all of them.

I think it may be valuable to enable the client to express its preferences (which then can be taken into account or not - it's up to the server).

Ok, I can see the preference bit now. But I'm not sure about practical usefulness. Would it be something like this?

Client: Hey, please give me Resource /XYZ. I understand profiles A and B by the way

Server: Sorry mate. I use profile C for my hypermedia controls

So what should happen now? 412 Precondition Failed?

I actually think of this in the opposite way, where a client retrieves an API Documentation and finds that the API uses certain profiles. It would then load the necessary code to handle them or fail if those profiles are not supported.

This way a client can be modular and not bloated with code not used for a given API.

I think I can see both terms level and profile used simultaneously - level for spec interpretation, profile for extensibility.

Is it possible that you're overly complicating the proposal? I'd rather that the levels were unnecessary and that the core semantics are uniform.

You miss the point.

Not really - if a profile can be expressed in multiple places (API Documentation, inlined, headers), it may happen that the declared profile may change. What then? Understanding all of them is one approach. Visibility and hiding would be another. Don't know which one is better/worse.

So what should happen now? 412 Precondition Failed?

It depends on how we would specify the mechanism, but 412 Precondition Failed is not in scope (it is applicable to If-Unmodified-Since or If-None-Match). I was thinking about the Prefer header, but it's just a hint.

Is it possible that you're overly complicating the proposal? I'd rather that the levels were unnecessary and that the core semantics are uniform.

I just have to care about backward compatibility - introducing levels could save us from tampering with ranges as the interpretation of hydra:Resource may vary between levels.

I think the possible issues you mention exist regardless of profiles and are impossible to do anything about with Hydra. An API can change regardless of the profile. It doesn't even need a profile to change. And since profiles are identified by URI, anyone can create one. We can't control that, not within Hydra or anywhere else.

This has been the case for open-ended extensibility for as long as it has existed. It was like this for XML namespaces, it's the same for RDF where anyone can mint a URI to declare a new term and it's the same with profiles. What we can control is how these extension mechanisms work and then provide some guidance and recommendations on how to develop Hydra profiles and which we officially embrace and not. Other than that, it's all up to the Hydra users to decide.

I hope we can all agree that Hydra Core is the minimum of what a client must understand in order to interoperate with a Hydra-capable API. The profiles just extend Hydra Core, so a client can always fall back to Hydra Core to follow its links and perform its operations in a rudimentary but compliant way. So not understanding a profile is not the end of the world, it's just going to provide a worse user experience.

We can't control that, not within Hydra or anywhere else.

I never aimed to control that. I just have to put all the ideas to extreme situations in order to find out (or at least to get a hint of) how the spec behaves before we modify the spec.

The profiles just extend Hydra Core

It is still an idea - let's not be hasty in assuming profiles do exist in Hydra as it is now.

so a client can always fall back to Hydra Core

This is my aim - it is still possible to commit an unrecoverable mistake on the design stage which could render that very fallback impossible. I want to avoid situation when there are several hydra dialects so no client can talk to more than one with same base.

so a client can always fall back to Hydra Core

This will often not be an option? If the client finds an operation which expects some SHACL Shape and it does not understand SHACL, how would you "fall back to Hydra Core"?

If the client does not understand a profile, it will not be able to perform operations described using that profile. There is not much more to it, is there?
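
As a rough illustration of that point, the sketch below partitions operations into those a client can perform and those it has to skip because their description relies on a profile it does not understand. The `profile` key, the `CORE` label and the operation dictionaries are hypothetical simplifications, not Hydra terms.

```python
# Sketch: split operations into usable and skipped ones, based on whether the
# client understands the profile used to describe them. The "profile" key and
# the operation dicts are hypothetical simplifications.

CORE = "hydra-core"  # placeholder label for descriptions using only Hydra Core


def partition_operations(operations, understood):
    """Return (usable, skipped) lists of operation identifiers."""
    usable, skipped = [], []
    for op in operations:
        target = usable if op.get("profile", CORE) in understood else skipped
        target.append(op["id"])
    return usable, skipped


operations = [
    {"id": "ex:listPeople", "method": "GET"},
    {"id": "ex:createPerson", "method": "POST", "profile": "shacl"},
    {"id": "ex:uploadAvatar", "method": "PUT", "profile": "iana-media-types"},
]

usable, skipped = partition_operations(operations, understood={CORE, "shacl"})
print(usable)   # ['ex:listPeople', 'ex:createPerson']
print(skipped)  # ['ex:uploadAvatar']
```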

I'm actually not afraid of SHACL - it's still an RDF. Currently hydra client does not validate input of an operation against expected class, thus it may be the same for SHACL. I'm more afraid of those non-RDF possibilities.

I never aimed to control that. I just have to put all the ideas to extreme situations in order to find out (or at least to get a hint of) how the spec behaves before we modify the spec.

Ok, fair enough.

The profiles just extend Hydra Core

It is still an idea - let's not be hasty in assuming profiles do exist in Hydra as it is now.

No, profiles exist regardless of what Hydra decides to do in this space. Officially embraced profiles may not exist without the Hydra CG's seal of approval, but profiles (just like XML namespaces, RDF terms or anything else based on a URI) can be created by anyone.

so a client can always fall back to Hydra Core

This is my aim - it is still possible to commit an unrecoverable mistake on the design stage which could render that very fallback impossible. I want to avoid situation when there are several hydra dialects so no client can talk to more than one with same base.

That is only possible to a certain degree, though. We are shooting for much more than the moon if we think we can make everything fallback gracefully.

so a client can always fall back to Hydra Core

This will often not be an option? If the client finds an operation which expects some SHACL Shape and it does not understand SHACL, how would you "fall back to Hydra Core"?

Excellent point. This goes a bit back to the content negotiation bit again. There will be situations where a server and client can't find an agreeable format in which to communicate here, just as it exists on the web at large.

Some clients don't understand PDF and some servers have no way to translate it into something a client actually understands. That's as applicable to Hydra (although not the PDF bit, perhaps) as it is to a text terminal browsing a website of legal documents.

If the client does not understand a profile, it will not be able to perform operations described using that profile. There is not much more to it, is there?

I was thinking that a lot of the vocabulary in Hydra Core will allow a client to interoperate with the API without understanding every part about the API. Like a browser that only understands CSS level 1 may be able to render color, fonts, borders, but very little layout and everything will most likely look pretty horrible. It should still work, though. Agreed?

I'm actually not afraid of SHACL - it's still an RDF.

It doesn't matter if it's RDF.

The client needs to understand the profile to understand whatever semantics a given expects or anything else has in order to create the request message. If it doesn't, then there is no way to fall back.

It should still work, though. Agreed?

See above. Yes, some of the API will work up to a point where the client cannot perform a request described in ways it does not understand 🤷. Surely, it does not mean that the entire API is unusable but YMMV

That said, if the client does communicate profiles that it does understand, the server might have the opportunity to resort to a less expressive alternative.

SHACL -> Hydra Core is definitely an option.
IANA media type profile could be impossible to "downgrade"

I would only just say that I would phrase this as a MAY in the spec. As in something like

the server MAY ignore the client's capabilities and use unsupported profiles
or omit operations which the client will not understand

It doesn't matter if it's RDF.

To some degree it does. I think this may be one of the reasons IANA media type profile could be impossible to downgrade as it may not be machine-readable.

the server MAY ignore the client's capabilities and use unsupported profiles
or omit operations which the client will not understand

I don't think it's good approach. If client doesn't understand something, it should ignore it.

No, profiles exist regardless of what Hydra decides to do in this space

These are not profiles - these are extensions. I'd call a profile something we could embrace in the spec.

Anyway, let's get our discussion back to the topic - which solution, and how could we make it happen?

I don't think it's good approach. If client doesn't understand something, it should ignore it.

One does not contradict the other. If the server knows that a client won't understand IANA media types, it may choose to omit them from the responses. If it doesn't then the client will indeed ignore them. Two sides of the same coin.

These are not profiles - these are extensions. I'd call a profile something we could embrace in the spec.

👌 I'm fine with calling them extensions. We did in fact refer to "extension points" in the past.

So profile would be closer to what you called level above?

Anyway, let's get our discussion back to the topic - which solution, and how could we make it happen?

I repeat, I think that removing rdfs:range is the way to go. Explicit "negated dereferencability" is probably an overkill if you intend it on individual resources.

I have nothing against the profiles per se but I think it just adds unnecessary complexity to the client implementation to conditionally apply different reasoning for the different profiles.

In other words, I think it's enough that hydra:Resource will be the only way to explicitly state dereferencability. All of Hydra's terms are already hydra:Resource, as well as objects of hydra:Link properties. In practice this may leave little room for actual breakage from such a change

I was asked by @alien-mcl to weigh in.

First of all, let me mention that I have never agreed with the notion of dereferenceability being baked into a class (such as hydra:Resource), because it is a level-breaker.
To me, usage of HTTP URLs implies dereferenceability; that's also how I build REST APIs.

As such, I prefer Option 1 – Downgrade. FWIW, I don't consider this a breaking change, given that I don't think the original definition was valid/compatible with RDF/enforceable anyways.

Option 2 I consider not acceptable because it is also a level-breaker.

Option 3 is too complex.

Thanks @RubenVerborgh for taking part in the discussion.

As such, I prefer Option 1 – Downgrade. FWIW, I don't consider this a breaking change, given that I don't think the original definition was valid/compatible with RDF/enforceable anyways.

As for considering what might get broken - I know TPF is close to RDF and downgrading hydra ranges for some of the terms will break reasoning process. Resources will be no more hydra:Resource, thus any implementation relying on this will get broken.

To me, usage of HTTP URLs implies dereferenceability

This is the issue - one of our community members has issues with this assumption. He uses HTTP Urls for identifying resources but in some circumstances he knows that server won't provide any of these resources. Not using hydra:Resource will somehow help as it won't impose a resource is somehow guaranteed to have a GET operation supported, but still - HTTP based Url may not be safely obtained just because it is HTTP based.

I know TPF is close to RDF and downgrading hydra ranges for some of the terms will break reasoning process. Resources will be no more hydra:Resource, thus any implementation relying on this will get broken.

The only thing that hydra:Resource offered over the REST or RDF notion of a resource (which, BTW, are the same), was a promise of dereferenceability—but it was not in the position to make such a promise. So I have never been able to rely on that anyway. It's always been just a REST/RDF resource for me.

To me, usage of HTTP URLs implies dereferenceability

This is the issue - one of our community members has issues with this assumption.

That's how Web architecture works though.

He uses HTTP Urls for identifying resources but in some circumstances he knows that server won't provide any of these resources.

The easy answer is of course "don't", and I'm speaking from a Web arch perspective here, not even RDF. But I assume there are reasons.

Not using hydra:Resource will somehow help as it won't impose a resource is somehow guaranteed to have a GET operation supported

It does not make a difference at all, and perhaps it's important that everyone in the thread understands.

hydra:Resource was never able to provide that guarantee, and here is why.

Take this example, here in JSON-LD notation:

{
  "@context": "http://www.w3.org/ns/hydra/context.jsonld",
  "@id": "http://example.org/my/resource",
  "@type": "Resource"
}

and here in Turtle:

<http://example.org/my/resource> a <http://www.w3.org/ns/hydra/core#Resource> .

What it is supposed to say according to Hydra is:

  • I can dereference the URL http://example.org/my/resource through GET

However, the above interpretation is not possible in RDF. It's not.

Because what it actually says is:

I.e., the assertion is on the node not on the identifier of that node. We cannot assert that a node is dereferenceable; only URIs can be.

What Hydra should have done is something like

<http://example.org/my/resource> ex:hasDereferenceableUri "http://example.org/my/resource"^^xsd:anyURI .

So hydra:Resource has been meaningless all along; removing that meaning does not change or break anything.

I think you are stretching the argument too thin over nit-picking on RDF nuances.

Most people will not go into the details of URI vs the actual resource. To them, if the spec says "hydra:Resource are guaranteed to be dereferenced", that will be enough. Hence the desire to write implementations accordingly.

The hasDereferenceableUri could be more kosher but more alien to non-RDF people still.

That is not to say I don't agree. I would also like to remove the ranges. The promise of dereferencability is broken anyway. Even all the hydra:Class etc which come with an API Documentation could likely not be dereferencable, despite being an implied hydra:Resource. It would be a lot of additional burden to ensure that every API's Hydra term indeed nicely dereferences. Where the default behaviour is to simply find those resources in the API Documentation resource representation.

Thus, the only resource which really does have to dereference is the API Documentation.

I think you are stretching the argument too thin over nit-picking on RDF nuances.

My point is rather that the whole hydra:Resource and rdfs:Resource distinction is the nitpicking.

My point is rather that the whole hydra:Resource and rdfs:Resource distinction is the nitpicking.

Yep - it returned in several discussions here and there.
Maybe there is still hope for those terms. Imagine a given Urls are in local API's domain, let's say http://some.uri/api/vocab#Class provided by the http://some.uri/api - implementation may give a promise that this very resource may be safely dereferenced.

Still, decoupling some ranges from hydra:Resource or hydra:Class (probably some class inheritance should be changed as well) sounds tempting.

Anyway, does it answer topic of this issue - document forbidden dereferencability? Maybe giving tools promising a resource can be safely dereferenced leaving all other Uris unspecified (call it on your responsibility) is enough?

I'll clarify as it seems to me my point has been stretched beyond my assertion.

I never stated that I wanted to prevent anyone from doing an HTTP GET on a URI they find, that'd be impossible by definition. I've stated that I didn't want to make assertions that I know not to be true, even though such resources are not addressable directly in my API. You may or may not agree that it's good design; it is, however, an implementation approach that is very common in ReSTful APIs, whether they are RDF-centred first or not.

This is the issue - one of our community members has issues with this assumption.
That's how Web architecture works though.

We use IRI identifiers for things that sometimes can or cannot be dereferenced, and we know, as an API, that fact, most of the time. The distinction is simple: A UA would render an @id as a clickable link if it can be resolved, and would not if it cannot. Having clickable URIs that lead to nowhere is just not a good sell for an API.

By making things hydra:Resource resolvable full stop, as the spec does right now, you can't not make it clickable. So I'm stuck. I expect: a schema:Person in that operation, and it does have an ID, but it's not an information resource, I know that. No way to communicate the fact.

Worse, hash URIs cannot be dereferenced over http by definition of the protocol (removing the hash is not the same URI anymore since we've had a URI spec, so the UA behaviour is beyond the point), even though they may be used as identifiers, making the whole dereferenceable statement pretty strange to begin with.

See fragid conversations around https://www.w3.org/2001/tag/issues.html#abstractComponentRefs-37

So right now, as per the spec, I can't make the distinction between things that are supposed to be resolvable and things that are not. If the guarantee was never intended, let's remove it and introduce another mechanism by which an API can declare that the @id can be de-referenced and make that guarantee a thing. Or we just remove it all and, unless a GET is defined as an operation, we say there is no declared operation, and we consider the issue a done deal.

The conversations that go beyond that simple scenario, which I can't implement with the spec as it exists without doing some unnatural RDF gymnastics, bringing in another specification or whatnot, are not something I feel I can add to.
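
The "no declared GET, no link" rule suggested in this comment could look roughly like the sketch below. The resource dictionaries loosely mimic JSON-LD, and their shape, the `operation` and `method` keys and both helper names are assumptions made for the example rather than anything the spec prescribes.

```python
# Sketch: a user agent renders an @id as a clickable link only if a GET
# operation is explicitly declared for the resource. The dict shape loosely
# mimics JSON-LD and is an assumption made for this illustration.

def has_declared_get(resource):
    """True if the resource description lists an operation with method GET."""
    return any(
        op.get("method", "").upper() == "GET"
        for op in resource.get("operation", [])
    )


def render(resource):
    """Return an HTML link for declared-GET resources, plain text otherwise."""
    iri = resource["@id"]
    if has_declared_get(resource):
        return f'<a href="{iri}">{iri}</a>'
    return iri


person = {"@id": "http://example.org/people/1", "@type": "schema:Person"}
document = {
    "@id": "http://example.org/docs/1",
    "operation": [{"method": "GET"}],
}

print(render(person))    # http://example.org/people/1
print(render(document))  # <a href="http://example.org/docs/1">http://example.org/docs/1</a>
```

With this rule the schema:Person identifier stays plain text, because nothing in its description claims it can be retrieved.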

The distinction is simple: A UA would render an @id as a clickable link if it can be resolved, and would not if it cannot. Having clickable URIs that lead to nowhere is just not a good sell for an API.

But why does the server generate those non-existing URIs then?

Are they placeholders for possible future resources?
If they are, then you might be looking rather for something like "created: false". (Resource state instead of identifier state.)

If they are placeholders for things that will never exist, I don't fully see the case (there might still be one, but please make it explicit).

But why does the server generate those non-existing URIs then?

Those don't have to be generated by the API in question. And quite specifically, this discussion started from hypermedia controls.

In my APIs I often define operations as such with hash URIs like http://my.app/api#SomePostOperation. Just enough for the client to find them in the API Documentation graph. But strictly speaking this is not dereferencable and I already break the implicit promise of hydra:Resource

And schema.org and many other shared vocabularies will also not play ball
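
The hash-IRI pattern described above can be sketched with the standard library: the fragment is stripped before any request is made, and the term is then looked up by its full IRI in the retrieved graph. The `fetched` mapping stands in for already parsed responses, and `find_in_documentation` is a hypothetical helper.

```python
# Sketch: a hash IRI is never requested as-is. The fragment is stripped to get
# the document URL, and the term is looked up in that document's graph.
# `fetched` stands in for parsed response bodies and is hypothetical.
from urllib.parse import urldefrag


def find_in_documentation(iri, fetched_documents):
    """Locate a node by its full IRI inside the document its IRI points into."""
    document_url, _fragment = urldefrag(iri)
    graph = fetched_documents.get(document_url, [])
    return next((node for node in graph if node.get("@id") == iri), None)


api_documentation = [
    {"@id": "http://my.app/api#SomePostOperation", "method": "POST"},
]
fetched = {"http://my.app/api": api_documentation}

print(urldefrag("http://my.app/api#SomePostOperation").url)  # http://my.app/api
print(find_in_documentation("http://my.app/api#SomePostOperation", fetched))
```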

I don't see a lot of issues then; just get rid of the (currently broken anyway) hydra:Resource.

Sounds like a consensus to me - I'll try to prepare a pull request regarding this issue. I'll introduce next steps soon.

@serialseb - PR with changes regarding this issue is merged and closed now. Is it ok to close this issue?

I believe so yes

Great to see that - I'm closing it.