zaggino/z-schema

Clarification on validating the URI format and the reason for strictUris mode.

k7sleeper opened this issue · 16 comments

With a schema that declares $schema: 'http://json-schema.org/draft-04/schema#', setting the remote schema

ZSchema.setRemoteReference('http://json-schema.org/draft-04/schema',
  fs.readFileSync (path.join __dirname, 'json-schema-draft-04', 'schema.json'), 'utf8')

and validating something against that schema leads to

{
  code: 'FORMAT',
  message: 'uri format validation failed: schemaA',
  path: '#/id',
  params: { format: 'uri', error: 'schemaA' } 
}

If I remove $schema: 'http://json-schema.org/draft-04/schema#' from my schema, validation succeeds.

My schema is quite simple:

schemaA = {
    $schema: 'http://json-schema.org/draft-04/schema#',
    id: "schemaA",
    type: "object",
    properties: {
      a: {
        type: "integer"
      },
      b: {
        type: "string"
      }
    },
    required: ['a']
  };

If I change the id value from 'schemaA' to 'http://my.company.de/project#mainSchema', then validation succeeds. Isn't 'schemaA' a valid URI?

@k7sleeper look here: https://github.com/json-schema/JSON-Schema-Test-Suite/blob/develop/tests/draft4/optional/format.json (these are the tests I use). According to them, your schemaA is an invalid URI, though it is a valid URI reference.

Thanks very much.
I'm sorry that I created this issue.

I'm a bit puzzled: how does the test conform to JSON Schema: core definitions and terminology, sections 7.2.1 and 7.2.2?

If you think the test is not correct, definitely open an issue at the https://github.com/json-schema/JSON-Schema-Test-Suite repository. I'll pick up any changes made to those tests.

Also keep this issue open; I'll look into it more when I have time.

As soon as I understand the difference between "URI" and "URI Reference", I'll open an issue if justified.

Here it says it must be valid according to RFC3986 - http://json-schema.org/latest/json-schema-validation.html#anchor123

Here http://tools.ietf.org/html/rfc3986#appendix-A you can see URI defined as:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

So your schemaA is definitely not a valid URI.
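To make the grammar above concrete, here is a minimal sketch that checks only the leading scheme ":" part of the URI rule, which is the part a relative reference such as schemaA lacks. The helper name startsWithScheme is made up for illustration, and the check does not validate the hier-part, query or fragment.

```javascript
// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )   (RFC 3986, section 3.1)
const SCHEME_PREFIX = /^[A-Za-z][A-Za-z0-9+.-]*:/;

// Hypothetical helper: true when the string starts with "scheme:".
// It deliberately ignores everything after the colon.
function startsWithScheme(value) {
  return SCHEME_PREFIX.test(value);
}

// The three id values mentioned in this thread.
const samples = [
  'schemaA',
  'http://my.company.de/project#mainSchema',
  'some://where.else/completely#'
];

for (const sample of samples) {
  console.log(sample, startsWithScheme(sample));
}
```

Only the first sample has no scheme, so it can at best be a URI reference.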

But, in my opinion, we cannot ignore JSON Schema: core definitions and terminology, sections 7.2.1 and 7.2.2.

As far as I can see, the id attribute need not be a valid URI (in the strict sense of RFC 3986), but as I said, that's only my understanding.

How can the following problem be solved: a couple of JSON files in a local directory, each JSON file contains a schema, some of the schemas refer to others by id.
Which id values should the schemas have? A fake URI like "http://my.company.com/projectName/schemaName"? That's no problem for me, but my feeling is that this is not the best solution according to Draft 4.
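For illustration, the fake-URI layout could look roughly like the sketch below. Both ids are invented addresses that are never fetched, schemaB is a made-up second schema, and the in-memory registry only stands in for whatever a validator does when remote references are registered; it is not z-schema's API.

```javascript
const DRAFT4 = 'http://json-schema.org/draft-04/schema#';

// Two schemas as they might sit in separate local JSON files.
const schemaA = {
  $schema: DRAFT4,
  id: 'http://my.company.com/projectName/schemaA',
  type: 'object',
  properties: {
    a: { type: 'integer' },
    // Refers to the other local schema by its (fake) absolute id.
    b: { $ref: 'http://my.company.com/projectName/schemaB' }
  },
  required: ['a']
};

const schemaB = {
  $schema: DRAFT4,
  id: 'http://my.company.com/projectName/schemaB',
  type: 'string'
};

// Hypothetical registry: id -> schema.
const registry = new Map([schemaA, schemaB].map((schema) => [schema.id, schema]));

// Look up a $ref, ignoring a trailing empty fragment ("#").
function lookup(ref) {
  return registry.get(ref.replace(/#$/, ''));
}

console.log(lookup(schemaA.properties.b.$ref).type);
```

Because every id is absolute, the lookup needs no knowledge of where the files live on disk.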

Briefly: I guess the id attribute will be treated differently from an arbitrary URI attribute.

Well, I prefer not to ignore any part of JSON Schema that I'm able to implement.
I kind of think the fake URI is better than just a plain string, something in the style of Java packages.
Have you tried whether some://where.else/completely# validates?

There is always an option that you can remove $schema from your schemas and it will work fine.

If you wish, I can add a custom switch so the validator will skip validation against $schema.

  1. No, I'd like to indicate which specification is used. I have to share the schemas with our Java folks. We want to be as thorough as possible.
  2. No. I suggest changing nothing until things are clear. Maybe I'll get in contact with the schema specification guys. They speak about 'URI' and 'URI reference'. I'd like to understand their intentions well.

Ping @fge , @nickl- , @kriszyp , @Julian
Can any of you guys help us here? Should schemaA be a valid id in JSON Schema according to http://json-schema.org/draft-04/schema?

That is quite a good point :).

I don't think I ever noticed that section (7.2.1) using things that are not valid URIs.

I was the one who wrote that test (in response to a bug report from someone complaining that "asdf" was passing a {"format": "uri"} schema they wrote), and I do think that URI there means URI in the RFC3986 sense (i.e. not just a URI reference).

I also think that that usage of id seems reasonable.

So, in short, I think that probably what's wrong here is the meta schema -- I think id probably was meant to be a URI reference, but that format doesn't exist in draft 4. So to me it looks like something is missing.

@fge wrote it, and probably can clarify the intentions.

@geraintluff is writing draft 5 and probably also can clarify, and especially I'm pinging him to let him know about the ambiguity here :).

I had assumed that "format": "uri" included relative URIs - if not, then I'd argue it's not very useful.

I also think the wording for v4 is not good - after all, the examples in section 7.2.2 include otherschema.json and #foo, so being absolute is clearly not a requirement.
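As a rough sketch of why relative ids such as otherschema.json and #foo are still usable, they can be resolved against the id of the enclosing schema. The base below is the fake URI floated earlier in this thread, resolveId is a made-up helper, and the built-in URL class follows the WHATWG URL rules, which agree with RFC 3986 resolution for simple cases like these.

```javascript
// Base id borrowed from the "fake URI" suggestion in this thread.
const base = 'http://my.company.com/projectName/schemaA';

// Hypothetical helper: resolve a possibly relative id against a base id.
function resolveId(id, baseId) {
  return new URL(id, baseId).href;
}

console.log(resolveId('otherschema.json', base));
// http://my.company.com/projectName/otherschema.json

console.log(resolveId('#foo', base));
// http://my.company.com/projectName/schemaA#foo
```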

@zaggino: All things considered, I'd suggest also supporting relative URIs for the id attribute in z-schema without having to set an extra option, since it is now a known weakness in the v4 draft.

If I replace URI validation with URI-reference validation as defined in RFC3986, then your schemaA will pass, as will many other invalid things, like foobar http://example.com foobar.

But in the end, the positives probably outweigh the negatives for now. I'll add an option to turn on strict URIs or something.
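The trade-off can be sketched as follows. This is not z-schema's implementation: checkUri and its strict parameter are illustrative stand-ins for the strictUris mode named in the issue title. The lenient branch uses the parsing expression from RFC 3986, Appendix B, in which every group is optional, so it splits strings into components rather than rejecting them.

```javascript
// Parsing expression from RFC 3986, Appendix B (all groups optional).
const URI_REFERENCE = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
const SCHEME = /^[A-Za-z][A-Za-z0-9+.-]*$/;

// Hypothetical checker: lenient mode accepts anything the Appendix B
// expression can split; strict mode also requires a well-formed scheme.
function checkUri(value, strict) {
  const match = URI_REFERENCE.exec(value);
  if (match === null) return false; // kept for safety; the expression matches any string
  if (!strict) return true;
  return match[2] !== undefined && SCHEME.test(match[2]);
}

for (const value of ['schemaA', 'http://my.company.de/project#mainSchema', 'foobar http://example.com foobar']) {
  console.log(value, 'lenient:', checkUri(value, false), 'strict:', checkUri(value, true));
}
```

In this sketch, lenient mode lets schemaA through but also accepts the foobar string, while strict mode rejects both and keeps only the absolute http id.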

Thanks, that's useful.