Stranger6667/jsonschema-rs

Custom Validators


I got a feature request, in a project that uses this library, to validate file paths in a document. The natural solution would be to add a custom validator for this, but that's currently not possible. The closest I can get is format validators, but they lack any context.

Would it be desirable to expose a way to hook into the compilation process and add entirely custom validators?

This might not be a good idea because:

  • it has nothing to do with the JSON schema spec
  • complexity and larger API surface

However, it would be very convenient for my use case.

If this is desirable, I imagine we'd need to:

  • expose the Validate trait or something similar
  • expose a way to hook into the compilation process and create validators
  • add a custom error type that could be pretty much anything (probably Any + Display)
  • add a way to pass contextual information to validators to avoid recompilations (optional, I have no idea how much complexity this would add)

I'd just like to add my own use case, as additional information about how people are using the project and working around this particular issue (and #245).

I use the Python binding, but I have a custom string format called "currency" which means I can't use this library for all of my validation. The "currency" format requires that a string has some amount of digit characters, followed by a period, followed by exactly two digit characters. It is meant to work around the fact that JSON does not have a fixed-point decimal type, so currency figures are stored in strings to maintain precision / avoid floating point nonsense.
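For concreteness, the "currency" rule described above could be sketched as a regular expression. This is only an illustration: it assumes "some amount of digit characters" means one or more ASCII digits, and the names `CURRENCY_RE` and `is_currency` are made up for this example.

```python
import re

# Assumption: "some amount of digits" means one or more ASCII digits,
# followed by a period and exactly two more digits (for example "12.50").
CURRENCY_RE = re.compile(r"[0-9]+\.[0-9]{2}")


def is_currency(value: object) -> bool:
    """Return True if value is a string in the assumed currency format."""
    # fullmatch requires the whole string to match, so no anchors are needed.
    return isinstance(value, str) and CURRENCY_RE.fullmatch(value) is not None
```

Under these assumptions, `is_currency("12.50")` is true, while `"12.5"`, `".50"`, and the float `12.50` are all rejected.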

I have a wrapper function around the validation that first checks whether the provided schema contains (recursively) any "currency" format:

from typing import Any


def detect_currency_format(schema: Any) -> bool:
    """Return True if the schema uses the "currency" format at any depth."""
    if isinstance(schema, list):
        return any(detect_currency_format(x) for x in schema)
    if isinstance(schema, dict):
        if schema.get("format") == "currency":
            return True
        return any(detect_currency_format(v) for v in schema.values())
    return False

If the schema does contain the format, I know I need to use another library to validate that entire schema. The other library I use is jsonschema. There I can add custom formats, which I do for "currency", and provide a lambda for the validation which just calls out to a regex.
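The overall dispatch can be sketched roughly as follows. To keep the example independent of either library's API, the two validators are passed in as plain callables; `validate_with_fallback`, `fast`, and `slow` are hypothetical names, standing in for jsonschema-rs and for jsonschema with the custom "currency" format respectively.

```python
from typing import Any, Callable

# Hypothetical validator signature: (schema, instance) -> None, raising on failure.
Validator = Callable[[Any, Any], None]


def detect_currency_format(schema: Any) -> bool:
    """Return True if the schema uses the "currency" format at any depth."""
    if isinstance(schema, list):
        return any(detect_currency_format(x) for x in schema)
    if isinstance(schema, dict):
        if schema.get("format") == "currency":
            return True
        return any(detect_currency_format(v) for v in schema.values())
    return False


def validate_with_fallback(
    schema: Any, instance: Any, fast: Validator, slow: Validator
) -> None:
    """Use the fast validator unless the schema needs the custom format."""
    chosen = slow if detect_currency_format(schema) else fast
    chosen(schema, instance)
```

A single "currency" string anywhere in the schema sends the whole schema down the slow path, which is why support for custom formats in the fast library would matter here.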

Furthermore, the other library allows for overriding keyword validation in general. I use that to override default behavior for the numeric modifiers "multipleOf", "minimum", "maximum", "exclusiveMinimum", and "exclusiveMaximum" so that they apply to currency strings as if they were any other numeric type. From the perspective of designing the solution to this issue I wanted to show via my use case that not only would custom validation on new keywords be useful, but so would overriding behavior on existing keywords (to me, at least).
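To illustrate what those overrides have to do, here is a library-independent sketch of applying the numeric keywords to a currency string using `decimal.Decimal`. It is not the actual override code: the function name `currency_keyword_errors` is made up, the value is assumed to have already passed the "currency" format check, and `exclusiveMinimum` / `exclusiveMaximum` are assumed to be numbers (as in draft 6 and later) rather than booleans.

```python
from decimal import Decimal
from typing import Any, Dict, List


def currency_keyword_errors(value: str, schema: Dict[str, Any]) -> List[str]:
    """Check a currency string against numeric keywords, returning error messages."""
    amount = Decimal(value)  # exact decimal arithmetic, no float rounding
    errors: List[str] = []

    def limit(keyword: str) -> Decimal:
        # Go through str() so a JSON number like 0.05 becomes exactly Decimal("0.05").
        return Decimal(str(schema[keyword]))

    if "minimum" in schema and amount < limit("minimum"):
        errors.append(f"{value} is less than the minimum of {schema['minimum']}")
    if "maximum" in schema and amount > limit("maximum"):
        errors.append(f"{value} is greater than the maximum of {schema['maximum']}")
    if "exclusiveMinimum" in schema and amount <= limit("exclusiveMinimum"):
        errors.append(f"{value} is not greater than {schema['exclusiveMinimum']}")
    if "exclusiveMaximum" in schema and amount >= limit("exclusiveMaximum"):
        errors.append(f"{value} is not less than {schema['exclusiveMaximum']}")
    if "multipleOf" in schema and amount % limit("multipleOf") != 0:
        errors.append(f"{value} is not a multiple of {schema['multipleOf']}")
    return errors
```

With a keyword-override hook, each of these branches would become the handler for its keyword whenever the instance is a currency string, falling back to the default behavior for ordinary numbers.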

So, in effect, I only really use jsonschema-rs to improve the performance of my type-checking where I can, which is for any schema that doesn't have a "currency" string. Dropping my dependency on the much, much slower Python-only jsonschema library would take both custom format validation in the Python binding (#245) and custom validation / keyword behavior overriding (this issue). The alternative is convincing my team to implement this particular module in Rust, which would obviate my need for #245 but would still require this issue.

Let me just make it clear, too: the performance benefits of plugging in jsonschema-rs when I can are definitely worth it :). Love this library to death, and thank you @Stranger6667!