eclipsesource/play-json-schema-validator

Problem resolving relative references from jar resources


I know that there have been a few answered posts regarding relative references, but my issue is a bit different in its requirements.

Say I've got the following schema files:
resources/schemas/my_schema.schema:

{
  "properties": {
    "foo": {
      "$ref": "foo.schema#"
    }
  }
}

resources/schemas/foo.schema:

{
  "properties": {
    "bar": {
      "type": "string"
    }
  }
}

I load the top-level schema file by using this in my code:
getClass.getResource("/schemas/my_schema.schema") to retrieve a URL.

  • In my tests, that URL has the protocol file:, so the URL becomes file:/absolute/path/to/schemas/my_schema.schema and relative references work fine. Note that this is not jar-relative; it's just a file system path (presumably because it's running via sbt).
  • In my production environment, the jar is deployed and run via java -jar /absolute/path/to/my_jar.jar. Then the protocol for the top-level schema becomes jar:file: and the URL becomes jar:file:/absolute/path/to/my_jar.jar!/schemas/my_schema.schema (note the !), and relative references do not work ("Could not resolve ref foo.schema#").
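To illustrate the difference between the two URL shapes, here is a minimal Scala sketch that uses only java.net.URI. It is not taken from the validator's code and makes no claim about how the library resolves refs internally. It only shows that a jar: URI is opaque to java.net.URI, so resolving a relative reference against it returns the reference unchanged, whereas the same call on a file: URI yields the sibling path. The object name UrlShapes and the paths are placeholders.

```scala
import java.net.URI

object UrlShapes {
  // Top-level schema location when running from sbt (plain file system path).
  val fileBase = new URI("file:/absolute/path/to/schemas/my_schema.schema")

  // Top-level schema location when running from a packaged jar.
  val jarBase = new URI("jar:file:/absolute/path/to/my_jar.jar!/schemas/my_schema.schema")

  // Resolve a relative reference against a base using java.net.URI.
  def resolve(base: URI, ref: String): URI = base.resolve(ref)

  def main(args: Array[String]): Unit = {
    println(resolve(fileBase, "foo.schema")) // file:/absolute/path/to/schemas/foo.schema
    println(jarBase.isOpaque)                // true
    println(resolve(jarBase, "foo.schema"))  // foo.schema
  }
}
```

Whether this is the actual cause of the "Could not resolve ref" error depends on how the validator builds its resolution scope, which this sketch does not inspect.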

Using version 0.8.1

I've seen issue #48 regarding relative resources inside jars. However, I don't want to have to use the ClasspathUrlResolver and prefix my references with classpath: (as in "$ref": "classpath:foo.schema"). (By the way, please correct me if my interpretation of #48 is mistaken.)

Is there some way that I can create a resolver that doesn't require a protocol, but does the same as ClasspathUrlResolver regarding relative references? For instance, could I create a resolver with a protocol of ""?

Thanks for any help you can provide.

A quick update: The problem with even creating a "protocol-less" classpath resolver is that it will break my tests. The tests run using file system paths, and the production application creates classpath-relative paths. It seems there is no relative resolver that can handle both of these situations simultaneously?

Hmm... that last comment may not be true, as the ClasspathUrlResolver simply uses getClass.getResource just like I do to load my top-level schema. That method seems to resolve correctly whether running via sbt or java command.

I'll do some tests and see what comes of it. I'm still interested in a protocol-less custom resolver if that's possible though, so I don't have to modify my schemas.

It seems I was mistaken about the use of ClasspathUrlResolver as in my testing it doesn't seem to work for relative classpath paths. That is, inside my_schema.schema:
"$ref": "classpath:foo.schema#" will not work, but
"$ref": "classpath:/schemas/foo.schema#" will.
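One possible workaround, sketched here under stated assumptions rather than as anything the library provides: turn the relative ref into an absolute classpath name by resolving it against the classpath path of the parent schema, then pass that name to getClass.getResource. The object ClasspathRefs and the helper absoluteClasspathName are hypothetical names introduced for illustration.

```scala
import java.net.URI

object ClasspathRefs {
  // Hypothetical helper: resolve a relative ref such as "foo.schema#" against
  // the classpath path of the schema that contains it.
  def absoluteClasspathName(parentPath: String, ref: String): String = {
    val withoutFragment = ref.takeWhile(_ != '#') // drop "#..." if present
    new URI(parentPath).resolve(withoutFragment).getPath
  }

  def main(args: Array[String]): Unit = {
    val name = absoluteClasspathName("/schemas/my_schema.schema", "foo.schema#")
    println(name) // /schemas/foo.schema
    // getResource returns null when no such resource is on the classpath,
    // so wrap it in Option before use.
    println(Option(getClass.getResource(name)))
  }
}
```

A name with a leading slash is looked up from the classpath root, which matches the observation above that the /schemas/foo.schema form works.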

The optimal solution would allow resolving relative sub-schemas without the need for a protocol, as in my example schemas at the top of this issue. Thanks for any help!

Hi, I'm busy this weekend, but I'll check next week whether I can work out a solution for this. Maybe via a custom option set on the validator which tries to resolve relative references via a custom protocol as well. I'll let you know once I have something.

Thanks. Like I said, an option that doesn't require a protocol to be inserted into the schemas would be optimal. We use other tools which resolve the schemas relatively, and they likely won't recognize the paths if a custom protocol is prefixed to them. Looking forward to seeing what you come up with.

@andrewmeyerBR I've updated the validator to 0.8.2, including your proposed change (as well as a couple of other fixes). Please have a look at the respective test case and the shouldResolveRelativeRefsWithCustomProtocols method and let me know if this works for you. The option will cause the validator to try resolving refs with the registered custom protocols in case they couldn't be resolved successfully.

Hi Edgar,

Thanks for the update. How can I see the contents of “/my-schema.schema”? Searching the repo for this file didn’t return anything. I’d like to see what the paths look like.

A question: do you mean something more like “resolve relative references with custom resolvers” (vs “protocols”)? The way our conversation went before, there should be no protocol needed in the schema ref. Is this simply enabling relative references for all resolvers/protocols? If so, I would suggest renaming shouldResolveRelativeRefsWithCustomProtocols to something more like supportsRelativeRefs.

A couple of other comments:

  • Is there a way to create a "protocol-less" classpath resolver that overrides the default resolution mechanism? (It seems the default is a file system path resolver which supports relative refs.) I'd simply like to replace the default resolver with one that searches the classpath instead, without the need to add "classpath:" or the full paths (just paths relative to the top-level schema's classpath location).
  • I noticed the classpath resolver is in an internal package. Is this intentional?

The my-schema.schema file as well as the foo.schema file are part of the issue-65.jar which lies on the classpath in order to mimic the requested behaviour (also see Build.scala). So you can look for issue-65.jar and extract the contents in order to have a look at the files.

Yes, renaming the test to "resolve relative references with custom resolvers" probably makes more sense (I will do that), but I'm unsure about supportsRelativeRefs, since the validator does support relative refs by default (but not via custom resolvers). I think the method name should give users a hint that setting this option involves custom resolvers.

Regarding your other comments:

  • The problem is that there is currently no such thing as a file-system resolver. But I guess it would make sense to extract that behavior into a resolver of its own, which might involve a bit of refactoring. That said, you shouldn't need to specify "classpath:" or the full path if you have registered the classpath resolver and have the shouldResolveRelativeRefsWithCustomProtocols option set.
  • No, seems like I forgot to externalize it. I'll change this for 0.8.3. Thanks for the hint!

I see. I'm still a little confused regarding the new usage. Does the classpath resolver no longer require the "classpath:" protocol to be prefixed on the ref paths, even without setting shouldResolveRelativeRefsWithCustomProtocols?

Perhaps I can clarify this by explaining the inner workings: a UrlResolver is just a URLStreamHandler with an additional protocol field. With that protocol field a resolver indicates which protocol it is responsible for, hence we have a mapping from protocol to resolver. If a ref with a custom protocol is encountered, the validator checks whether a custom resolver is registered for that specific protocol and uses it to resolve the ref. So far, so good.

With 0.8.2 the custom resolvers are queried a second time, namely when the current (relative) ref couldn't be resolved at all so far. In that case, the mapping of custom resolvers is iterated over, and for each entry we prepend the protocol the resolver is responsible for to the value of the ref. For instance, in the initial example we have the relative ref "foo.schema#". The validator will not be able to resolve this ref since no such thing exists in the current resolution scope. It will iterate over the registered custom resolvers and find the classpath resolver. Resolution is now retried with the custom resolver and the ref set to classpath://foo.schema#. If resolution fails (which it doesn't in this case), we try the next custom resolver and so on until we either succeed or run out of resolvers.
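The retry loop described above can be sketched roughly as follows. This is a simplified illustration, not the validator's actual code: the Resolver type, the retryWithProtocols function, and the demo resolvers are all hypothetical stand-ins, and a resolver is modelled as a plain function from a ref string to an optional schema text.

```scala
object FallbackResolution {
  // Hypothetical stand-in for a custom resolver: given a full ref string,
  // return the resolved schema text, or None if it cannot resolve the ref.
  type Resolver = String => Option[String]

  // Retry an unresolved relative ref once per registered protocol, prepending
  // "<protocol>://" each time, and stop at the first resolver that succeeds.
  def retryWithProtocols(ref: String, resolvers: Seq[(String, Resolver)]): Option[String] =
    resolvers.iterator
      .map { case (protocol, resolve) => resolve(s"$protocol://$ref") }
      .collectFirst { case Some(schema) => schema }

  // Demo resolvers (assumptions for illustration only).
  val neverResolves: Resolver = _ => None
  val fakeClasspath: Resolver =
    ref => if (ref == "classpath://foo.schema#") Some("""{"properties":{}}""") else None

  val demoResolvers: Seq[(String, Resolver)] =
    Seq("other" -> neverResolves, "classpath" -> fakeClasspath)

  def main(args: Array[String]): Unit = {
    println(retryWithProtocols("foo.schema#", demoResolvers)) // Some({"properties":{}})
    println(retryWithProtocols("bar.schema#", demoResolvers)) // None
  }
}
```

Because the iterator is lazy, resolvers after the first successful one are never invoked.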

Hope I could clarify this a bit. If there are any other questions/comments, let me know.

Thanks for that explanation. It makes perfect sense. I would, however give a word of caution as to the coupling of the protocol-specific handlers with the relative resolution. Here's why:
For an instance using ClasspathUrlResolver:

  • Without shouldResolveRelativeRefsWithCustomProtocols, the user of the library needs to have "classpath:" prefixed in their schema ref paths.
  • With shouldResolveRelativeRefsWithCustomProtocols enabled, the user would not need to prefix "classpath:" in their ref paths.

This can be confusing because the "classpath:" should be required regardless of shouldResolveRelativeRefsWithCustomProtocols because there is a ClasspathUrlResolver.

A suggestion:

  • Decouple the logic of the two, providing a method addRelativeUrlHandler(handler: URLStreamHandler) instead of shouldResolveRelativeRefsWithCustomProtocols.
  • Then you can create a class analogous to ClasspathUrlResolver that is simply a URLStreamHandler (because it doesn't need a protocol) called ClasspathHandler and uses getClass.getResource internally (in fact, you could just make ClasspathUrlResolver implement ClasspathHandler with the UrlResolver trait mixed in).

In summary, you'd have the following:

  • addUrlResolver(urlResolver: UrlResolver)
  • addRelativeUrlHandler(handler: URLStreamHandler)

Then, the user could choose whether or not to require a protocol in the relative schema ref paths:

  • SchemaValidator().addRelativeUrlHandler(ClasspathHandler()): "classpath:" not required on relative schema ref paths
  • SchemaValidator().addRelativeUrlHandler(ClasspathUrlResolver()): "classpath:" is required on relative schema ref paths

For the case when there is both a set of protocol resolvers AND relative handlers, try the resolvers in order first, then the relative handlers in order if the resolvers all fail.
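The proposed ordering could look something like the sketch below. All names here (ResolutionOrder, Lookup, firstSuccess, resolve) are hypothetical and only model the idea with plain functions; the real addUrlResolver and addRelativeUrlHandler would work with UrlResolver and URLStreamHandler instances instead.

```scala
object ResolutionOrder {
  // Hypothetical stand-in for either a protocol resolver or a relative
  // handler: returns the resolved schema text, or None on failure.
  type Lookup = String => Option[String]

  // Try each lookup in registration order; return the first successful result.
  private def firstSuccess(ref: String, lookups: Seq[Lookup]): Option[String] =
    lookups.iterator.map(lookup => lookup(ref)).collectFirst { case Some(s) => s }

  // Protocol resolvers are consulted first; relative handlers are only
  // consulted when every protocol resolver has failed.
  def resolve(ref: String,
              protocolResolvers: Seq[Lookup],
              relativeHandlers: Seq[Lookup]): Option[String] =
    firstSuccess(ref, protocolResolvers).orElse(firstSuccess(ref, relativeHandlers))

  def main(args: Array[String]): Unit = {
    val failing: Lookup = _ => None
    val relative: Lookup = ref => Some(s"resolved $ref relatively")
    println(resolve("foo.schema#", Seq(failing), Seq(relative)))
    // Some(resolved foo.schema# relatively)
  }
}
```

Since orElse takes its argument by name, the relative handlers are not touched at all when a protocol resolver succeeds.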

I would also suggest renaming UrlResolver to something like UrlProtocolHandler to be more consistent with the protocol-less variant (URLStreamHandler)

I like your suggestions, thanks for the input. I'll update and release 0.8.3 by the end of the week.

@andrewmeyerBR I've implemented your suggestions. Do you mind having a look at the PR? Looking at the updated spec is probably already enough. If it fits your needs, I'll create 0.8.3 this weekend.

Hi @edgarmueller, looks really good. I made a few comments in your diff.

I've implemented most of your suggestions and released 0.8.3, hence I'm closing this for now. Please note my reply on your comments of the merged PR. If there's anything else you need, let me know and re-open this issue or create a new one. Thanks!