`min*`, `max*`, `pattern` constrained from validated JSON data
0x522D43 opened this issue · 2 comments
Hello,
Yesterday, I searched for a way to write a JSON schema that puts a constraint on the `maxItems` of an array based on the value of another property of my object.
Use Case
I would like to set a constraint value from the value of a property of the validated data.
Example
Schema
```yaml
type: object
properties:
  max_size:
    type: integer
    minimum: 1
  data:
    type: array
    items:
      type: string
    maxItems:
      $refValue: '#/properties/max_size' # equal to the `max_size` value
```
JSON to validate
| ✅ Valid | ⛔ Invalid |
|---|---|
| `{"max_size": 3, "data": ["a", "b"]}` | `{"max_size": 1, "data": ["a", "b"]}` |
| The length of `data` is less than or equal to `max_size` | The length of `data` is greater than `max_size` |
My suggestion
I thought about an additional keyword `$refValue` (better names can be suggested 😊) that refers to the value to use in its place.
I also thought about using the already existing `$ref`, but it does not have exactly the same purpose and behavior: `$ref` references a part of a schema, whereas here I need a reference to a value in the data being validated.
Scope
I think this feature can be applied to almost all Validation Keywords.
required and dependentRequired from Objects validation might be out of scope.
Schema consistency validation
To validate the consistency of a schema using this notation, we need to check that the type of the referenced target property matches the type of the constraint.
For example:
`maxItems` must match `type: integer` and `minimum: 0`, so the schema of the referenced property should be at least as strict as these constraints.
When the reference is not available during consistency validation, the constraint is not applied and, as with `$ref`, a warning or error can highlight the issue.
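To make the consistency check concrete, here is a minimal Python sketch of the idea. It is only an illustration under assumptions: `KEYWORD_VALUE_CONSTRAINTS`, `resolve_schema_pointer`, and `check_ref_value` are hypothetical names, the pointer handling ignores JSON Pointer escaping, and only `type` and `minimum` are compared.

```python
# Hypothetical table: the constraints each keyword's own value must satisfy.
KEYWORD_VALUE_CONSTRAINTS = {
    "maxItems": {"type": "integer", "minimum": 0},
    "minItems": {"type": "integer", "minimum": 0},
    "maxLength": {"type": "integer", "minimum": 0},
}


def resolve_schema_pointer(schema, pointer):
    """Follow a '#/a/b' pointer through the schema; return None if missing."""
    node = schema
    for part in pointer.lstrip("#").strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_ref_value(schema, keyword, pointer):
    """Return 'ok', 'unavailable', or 'mismatch' for one $refValue usage."""
    target = resolve_schema_pointer(schema, pointer)
    if target is None:
        return "unavailable"
    if not isinstance(target, dict):
        return "mismatch"
    required = KEYWORD_VALUE_CONSTRAINTS[keyword]
    if target.get("type") != required["type"]:
        return "mismatch"
    # The target's minimum must be at least as strict as the keyword's.
    if target.get("minimum", float("-inf")) < required["minimum"]:
        return "mismatch"
    return "ok"


schema = {
    "type": "object",
    "properties": {"max_size": {"type": "integer", "minimum": 1}},
}
print(check_ref_value(schema, "maxItems", "#/properties/max_size"))
```

With the schema from the example, `max_size` is declared as an integer with `minimum: 1`, which is stricter than what `maxItems` requires, so the check reports `ok`.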
JSON validation against Schema
When `$refValue` is parsed by the validator, it will:
- Check if the target is available
  - SUCCESS: go to the next step
  - FAILURE: error message indicating that the reference is not available
- Check that the target field is valid based on its own schema constraints (in the example, `max_size` should have `type: integer` and `minimum: 1`)
  - SUCCESS: go to the next step
  - FAILURE: error message that the referenced value does not match its own constraints
- Check that the target field is valid based on the constraints of the field to set (in the example, `maxItems` should have `type: integer` and `minimum: 0`)
  - SUCCESS: go to the next step
  - FAILURE: error message that the referenced field's constraints are not stronger than or equal to the constraints of the `max*`/`min*`... field
- Replace the `maxItems` value with the target value (in the example, replace `{"$refValue": "#/properties/max_size"}` with `3`)
  - SUCCESS: go to the next step
- Apply the validation as it is done currently
  - SUCCESS: go to the next token
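The steps above could be sketched roughly as follows in Python. This is a simplified illustration, not an existing validator: `validate_max_items` and `is_valid_integer` are hypothetical helpers, the reference is passed as a plain property name instead of a pointer, and the target's own schema is reduced to "integer with a given minimum".

```python
def is_valid_integer(value, minimum):
    """True when value is an int (bools excluded) and >= minimum."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= minimum


def validate_max_items(instance, array_name, ref_name, ref_minimum):
    """Validate instance[array_name] with maxItems taken from instance[ref_name]."""
    # Step 1: the target must be available in the data.
    if ref_name not in instance:
        return False, "referenced value is not available"
    limit = instance[ref_name]
    # Step 2: the target must satisfy its own schema (integer >= ref_minimum).
    if not is_valid_integer(limit, ref_minimum):
        return False, "referenced value does not match its own constraints"
    # Step 3: the target must be acceptable as a maxItems value (integer >= 0).
    if not is_valid_integer(limit, 0):
        return False, "referenced value is not a valid maxItems"
    # Steps 4 and 5: substitute the value, then validate as usual.
    if len(instance[array_name]) > limit:
        return False, f"{array_name} has more than {limit} items"
    return True, "valid"


print(validate_max_items({"max_size": 3, "data": ["a", "b"]}, "data", "max_size", 1))
print(validate_max_items({"max_size": 1, "data": ["a", "b"]}, "data", "max_size", 1))
```

The two calls correspond to the valid and invalid documents from the example: the first passes, the second fails at the last step because `data` has two items and the limit is 1.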
Security consideration
As the final schema is not known until the JSON to validate is provided, some bad data might be supplied as input.
This is why, as of now, I limited this suggestion to leaf-level keywords in the JSON tree with some strong constraints on them.
Be more generic?
This suggestion may be extended to a more generic implementation where `$refValue` will be available from anywhere, like `$ref`. But I think this should be discussed in another topic, as it requires a lot of thought and raises security concerns.
Let me know what you think about this.
Thank you
This use case (and the proposal) actually has a long history (see issues with the $data label).
The primary problem with this approach is validation of the schema itself.
```json
{
  "maxItems": { "$refValue": "#/properties/max_size" }
}
```
isn't a valid schema because `maxItems` MUST be a non-negative integer.
Yeah, we could make it so that it should be a non-negative integer OR this reference object (similar to how OpenAPI uses `$ref`), but we'd need to apply that to literally everything, and that gets messy pretty quickly.
However, this functionality can be achieved without breaking schema validity by using my data vocabulary. Here are a couple examples: https://docs.json-everything.net/schema/examples/data-ref/.
The data keyword is a separate keyword that dynamically builds a subschema that applies to the instance.
So in your case, you'd have
```yaml
type: object
properties:
  max_size:
    type: integer
    minimum: 1
  data: # this is your property
    type: array
    items:
      type: string
    data: # this is my keyword
      maxItems: '/properties/max_size'
```
This dynamically builds a secondary schema with `{"maxItems": <the data value (which we hope is an int)>}` and then evaluates the instance against that, effectively adding the new constraints.
Because all of this resides in a keyword that validates as "an object with string values" and resolves at evaluation time, it passes schema validation. If the pointer resolves to a value that isn't acceptable to the keyword, then evaluation halts.
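For readers who don't use .NET, the general idea can be illustrated with a small Python sketch. This is not how JsonSchema.Net is implemented; `resolve_pointer`, `build_dynamic_schema`, and `evaluate_max_items` are hypothetical helpers, only `maxItems` is handled, and the sketch assumes pointers are resolved directly against the root of the instance (hence `/max_size` below), which may differ from how the actual vocabulary resolves them.

```python
def resolve_pointer(instance, pointer):
    """Resolve a simple '/a/b' pointer through nested objects (no ~ escapes)."""
    node = instance
    for part in pointer.strip("/").split("/"):
        node = node[part]  # raises KeyError if the path is missing
    return node


def build_dynamic_schema(data_keyword, root_instance):
    """Turn {'maxItems': '/max_size'} into {'maxItems': <value at /max_size>}."""
    return {kw: resolve_pointer(root_instance, ptr) for kw, ptr in data_keyword.items()}


def evaluate_max_items(array, dynamic_schema):
    """Apply maxItems from the generated schema; halt on an unusable value."""
    limit = dynamic_schema["maxItems"]
    if not isinstance(limit, int) or isinstance(limit, bool) or limit < 0:
        raise ValueError(f"maxItems cannot be {limit!r}")
    return len(array) <= limit


root = {"max_size": 3, "data": ["a", "b"]}
dynamic = build_dynamic_schema({"maxItems": "/max_size"}, root)
print(dynamic, evaluate_max_items(root["data"], dynamic))
```

The `data` keyword itself only ever contains strings, so the schema stays valid; the integer appears only in the generated schema at evaluation time, and a value that cannot serve as `maxItems` stops evaluation with an error.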
I haven't seen any implementations of this outside my own JsonSchema.Net, which is (probably obviously) a .NET library.
Thank you for the links and examples. I will close this issue.