Confusing validation error for Item
philvarner opened this issue · 13 comments
Running against the attached JSON file, I get the confusing error message below.
$ stac-validator item.json
[
  {
    "version": "1.0.0",
    "path": "item.json",
    "schema": [
      "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
    ],
    "valid_stac": false,
    "error_type": "ValidationError",
    "error_message": "'collection1' should not be valid under {}. Error is in collection"
  }
]
Does any of this make sense? From the schema: https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json
    "properties": {
      "links": {
        "contains": {
          "required": [
            "rel"
          ],
          "properties": {
            "rel": {
              "const": "collection"
            }
          }
        }
      }
    }
  },
  "then": {
    "required": [
      "collection"
    ],
    "properties": {
      "collection": {
        "title": "Collection ID",
        "description": "The ID of the STAC Collection this Item references to.",
        "type": "string",
        "minLength": 1
      }
    }
  },
  "else": {
    "properties": {
      "collection": {
        "not": {}
      }
    }
  }
If you add something like this to your links, it will pass. I guess it just needs a collection link:
{
  "rel": "collection",
  "href": "./collection.json",
  "type": "application/json",
  "title": "Simple Example Collection"
}
Or, I think, you can just move the collection field into properties instead.
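To make the rule in the schema excerpt concrete, here is a rough plain-Python paraphrase of the if/then/else block. It is only a sketch: the name `collection_rule_ok` is made up for illustration and is not part of stac-validator, it assumes `links` is present and is a list of objects (which the rest of the Item schema requires anyway), and it ignores every other constraint in the schema.

```python
def collection_rule_ok(item: dict) -> bool:
    """Hypothetical paraphrase of the if/then/else rule in the Item schema."""
    # "if": links contains at least one object whose rel is "collection".
    # Assumption: item["links"] exists and holds dicts.
    has_collection_link = any(
        link.get("rel") == "collection" for link in item["links"]
    )
    if has_collection_link:
        # "then": collection is required and must be a non-empty string.
        collection = item.get("collection")
        return isinstance(collection, str) and len(collection) >= 1
    # "else": {"not": {}} rejects every value, so the field must be absent.
    return "collection" not in item


item = {
    "links": [{"rel": "self", "href": "./item.json"}],
    "collection": "collection1",
}
print(collection_rule_ok(item))  # False: field set, but no collection link

item["links"].append({"rel": "collection", "href": "./collection.json"})
print(collection_rule_ok(item))  # True: link and field are both present
```

Read this way, the `'collection1' should not be valid under {}` message comes from the `else` branch: without a collection link, any value at all for `collection` is rejected.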
There definitely could be better error messaging - we are just catching the error messages from jsonschema right now ...
Can we add messages to the schema itself?
@philvarner @gadomski would something like this help? Notice the help message at the end. We would need a lot of if statements to better explain these types of errors.
{
  "version": "1.0.0",
  "path": "phil.json",
  "schema": [
    "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
  ],
  "valid_stac": false,
  "error_type": "ValidationError",
  "error_message": "'collection1' should not be valid under {}. Error is in collection",
  "help": "If the error message doesn't make sense, refer to the schema"
}
To me,
Error is in collection
could be made more clear. I tend to find JsonSchema errors hard to understand, but to your point @jonhealy1 correctly re-wording all possible cases is probably out of scope. One alternative would be to fall back on JsonSchema's own string representation instead, e.g. change
stac-validator/stac_validator/validate.py (lines 307 to 310 in bb1cbd6) to:
err_msg = str(e)
In this case, the validator output would look like:
[
  {
    "version": "1.0.0",
    "path": "item.txt",
    "schema": [
      "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
    ],
    "valid_stac": false,
    "error_type": "ValidationError",
    "error_message": "'collection1' should not be valid under {}\n\nFailed validating 'not' in schema['allOf'][0]['allOf'][2]['else']['properties']['collection']:\n {'not': {}}\n\nOn instance['collection']:\n 'collection1'"
  }
]
Not sure if that's better?
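A possible middle ground between the short message and the full multi-line string would be to report the pieces as separate fields. The sketch below assumes the caught exception is a jsonschema `ValidationError`, which exposes `message`, `absolute_schema_path` and `absolute_path`. The helper name `error_fields` and the output keys `schema_path` and `instance_path` are hypothetical, not existing stac-validator fields.

```python
from types import SimpleNamespace


def error_fields(err) -> dict:
    """Split a jsonschema-style validation error into JSON-friendly fields."""
    return {
        "error_message": err.message,
        # Where in the schema the failing keyword lives.
        "schema_path": "/".join(str(p) for p in err.absolute_schema_path),
        # Where in the validated document the offending value lives.
        "instance_path": "/".join(str(p) for p in err.absolute_path),
    }


# Stand-in object carrying the values from the output above, so the sketch
# runs without jsonschema installed. A real caller would pass the caught error.
fake_err = SimpleNamespace(
    message="'collection1' should not be valid under {}",
    absolute_schema_path=[
        "allOf", 0, "allOf", 2, "else", "properties", "collection", "not",
    ],
    absolute_path=["collection"],
)
print(error_fields(fake_err))
```

This keeps the JSON output on one line per field while still pointing at the exact place in the schema to go and read.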
Haha it's not really any better and it's a little messy. With this type of error you really need to try to read the schema. We could try and catch this one circumstance. It is a little confusing what the rules are with where to put 'collection'.
It's something that should maybe be added to stac-check
Yeah, IMO a custom validator like this (or stac-check) does have a space to catch common problems that are hard to understand from jsonschema; this case (missing the collection link and/or collection attribute) is pretty common and really hard to understand from jsonschema errors, so might warrant a special check.
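As a rough idea of what such a special check could look like without a long chain of if statements, the sketch below keys plain-language hints on the tail of the failing schema path. Everything in it is an assumption for illustration: the `HINTS` table, the `help_for` function and the hint wording are hypothetical, and the path is assumed to be what jsonschema reports for this error (ending in `else`, `properties`, `collection`, `not`).

```python
# Generic fallback, taken from the "help" message proposed earlier in the thread.
FALLBACK = "If the error message doesn't make sense, refer to the schema"

# Hypothetical table: trailing schema-path segments -> plain-language hint.
HINTS = {
    ("else", "properties", "collection", "not"): (
        "The 'collection' field is set, but 'links' has no link with "
        "rel='collection'. Add a collection link or remove the field."
    ),
}


def help_for(schema_path) -> str:
    """Return a hint for a known failing schema path, else the fallback."""
    path = tuple(schema_path)
    for suffix, hint in HINTS.items():
        # Match on the end of the path so the nesting depth does not matter.
        if path[-len(suffix):] == suffix:
            return hint
    return FALLBACK


path = ["allOf", 0, "allOf", 2, "else", "properties", "collection", "not"]
print(help_for(path))
print(help_for(["required"]))
```

New cases would then be added as table entries, and anything unrecognised still gets the generic advice to read the schema.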
Printing the json schema validation error is a huge help, because then at least I know where to go look in the schema for the problem, even if that message is still unclear. I don't think there's any general way to map these errors, but... it may be useful to handle this case -- I'm pretty knowledgeable about STAC, and I didn't realize the schema actually required that there be a collection link if you define the collection field. (Though I kind of disagree with that constraint -- why shouldn't I be allowed to set the collection field value without a link to a collection?) At least the JSON error points you to where to look, since the current error message is useless.
I didn't know it was an issue either. If you've never worked with a json schema before, the second message isn't very helpful either. I think a help message advising someone to look at the schema is helpful.
I do think it's better to print out the full message like Pete did.