Possible improvements on not supported & invalid regexes
Stranger6667 opened this issue · 3 comments
In web APIs, users often write regular expressions in the syntax supported by their backend, which is sometimes incompatible, in some areas, with the syntax supported by JSON Schema.
For example, this AWS API uses character classes that are supported by Java, such as \p{Alpha}. This class is not supported by the Python stdlib re module, and hypothesis-jsonschema currently uses st.nothing() for such cases. In the simplest case, this leads to Unsatisfiable, as there are no values in this strategy.
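To make the incompatibility concrete, here is a minimal check using only the stdlib re module. The helper name is_supported_by_python_re is hypothetical and is not part of hypothesis-jsonschema; it only illustrates that compiling a Java-style class such as \p{Alpha} fails in Python.

```python
import re


def is_supported_by_python_re(pattern: str) -> bool:
    """Return True if the stdlib `re` module can compile `pattern`.

    Hypothetical helper for illustration only.
    """
    try:
        re.compile(pattern)
    except re.error:
        # `re` rejects escapes it does not know, such as \p{...}.
        return False
    return True
```

Calling is_supported_by_python_re(r"\p{Alpha}") reports the pattern as unsupported, while an equivalent ASCII-only class such as r"[A-Za-z]" compiles fine.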
But consider an array:
{
"items": {
"pattern": "\p{Alpha}",
"type": "string",
},
"maxItems": 50,
"minItems": 0,
"type": "array",
}
Even though we don't support generating strings for the regular expression in the schema above, we can still generate an empty array that matches the schema. The same applies to optional properties, which can simply be omitted, etc.
The current error output for the schema above:
hypothesis.errors.InvalidArgument: Cannot create a collection of max_size=50, because no elements can be drawn from the element strategy nothing()
From the user's perspective, it would be nice to expose some information about why this happens (the unsupported regex syntax). When we can still generate data without those items or properties, it could be a warning; when we can't (e.g., if there is minItems: 1), a better error message would be great.
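A rough sketch of that warning-versus-error behaviour, using only the stdlib. Everything here is an assumption made for illustration: the function check_array_schema, the exception UnsupportedPatternError, and the message wording are hypothetical and do not reflect how hypothesis-jsonschema is actually implemented. It also only looks at a pattern directly under items.

```python
import re
import warnings


class UnsupportedPatternError(ValueError):
    """Hypothetical error: a required pattern cannot be compiled by `re`."""


def check_array_schema(schema: dict) -> bool:
    """Return True if items can be generated, False if only [] is possible.

    Warns when the array can still be empty, raises when minItems > 0.
    """
    pattern = schema.get("items", {}).get("pattern")
    if pattern is None:
        return True
    try:
        re.compile(pattern)
        return True
    except re.error as exc:
        message = f"pattern {pattern!r} is not supported by Python's re module: {exc}"
    if schema.get("minItems", 0) > 0:
        # No valid array exists, so explain why instead of a bare Unsatisfiable.
        raise UnsupportedPatternError(message)
    # An empty array is still valid, so a warning is enough.
    warnings.warn(f"{message}; only empty arrays can be generated", stacklevel=2)
    return False
```

With the array schema from this issue (minItems: 0) the function warns and returns False; changing minItems to 1 raises UnsupportedPatternError with the same explanation.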
What do you think?
P.S. I am pretty sure that I saw a different InvalidArgument error that was also connected to drawing from nothing(). I will post an update once I find it.
I think these are two separate issues:
- Emit a warning when we encounter a regex pattern which is not valid in Python, and
- Work around Hypothesis' usual check that the list element strategy is non-empty if max_size is set
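One possible shape for the second item, sketched as a pure function so it does not depend on Hypothesis internals. The name clamp_list_bounds and its signature are hypothetical. The idea is that when the element strategy has no values, the size bounds are clamped to zero before the list strategy is built, so the max_size check never sees a positive bound; this assumes a list strategy with max_size=0 is accepted, which is worth confirming against the Hypothesis version in use.

```python
from typing import Optional, Tuple


def clamp_list_bounds(
    elements_empty: bool,
    min_items: int = 0,
    max_items: Optional[int] = None,
) -> Optional[Tuple[int, Optional[int]]]:
    """Pick (min_size, max_size) for a list strategy.

    Hypothetical helper. Returns None when no list can satisfy the schema.
    """
    if not elements_empty:
        # Elements can be drawn, so the schema's bounds are used unchanged.
        return min_items, max_items
    if min_items > 0:
        # At least one element is required but none can be generated.
        return None
    # Only the empty list is possible, whatever maxItems says.
    return 0, 0
```

For the schema in this issue, clamp_list_bounds(True, 0, 50) gives (0, 0), so only the empty array would be generated; with minItems: 1 it returns None, which is the case where a clear error is the better outcome.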
Both are fixable, of course 🙂
Amazing! Thank you! :)