thoth-org/Thoth.Json

Turning DecoderError into user friendly ValidationErrors

Opened this issue · 2 comments

I'd like decoder errors to be able to work with custom validation errors. Many of my types follow the "always valid" pattern, where types cannot be constructed without valid data.

As an example, given the following JSON, which has some type and value issues:

{
	"number": "a",
	"audit": {
		"createdBy": null,
		"date": "2021-07-66"
	}
}

I would like to generate some errors like the following:

[
	{ field = "number"; message = "The value 'a' was not a valid number" }
	{ field = "audit.createdBy"; message = "The field 'createdBy' is required" }
	{ field = "audit.date"; message = "The value '2021-07-66' was not a valid date" }
]

However, this does not appear to be possible: I cannot access a hierarchy of errors from a DecoderError. The type seems to represent only a single error. The signature is as follows:

type DecoderError = string * ErrorReason

I was expecting DecoderError to be a generic data structure able to represent arbitrary errors passed up from the decoders of all the fields, something like:

type DecoderError = DecoderItemError[]
type DecoderItemError = field: string * reason: ErrorReason * children: DecoderItemError[]
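For illustration, the structure proposed above could be written as a recursive F# record and flattened into the field/message list from the earlier example. This is only a sketch under assumptions: ErrorReason here is a simplified stand-in with hypothetical cases, not Thoth.Json's actual type, and DecoderItemError, ValidationError, and flatten are hypothetical names that do not exist in Thoth.Json.

```fsharp
// Simplified stand-in for an error reason; NOT Thoth.Json's real ErrorReason.
type ErrorReason =
    | BadValue of expected: string * actual: string
    | MissingField
    | Nested // the node itself is fine, the errors are in its children

// Hypothetical tree-shaped error, following the proposal above.
type DecoderItemError =
    { Field : string
      Reason : ErrorReason
      Children : DecoderItemError list }

// Hypothetical user-facing error, shaped like the desired output.
type ValidationError = { field : string; message : string }

// Walk the tree depth-first, building dotted paths such as "audit.date".
let rec flatten (prefix : string) (item : DecoderItemError) : ValidationError list =
    let path = if prefix = "" then item.Field else prefix + "." + item.Field
    let own =
        match item.Reason with
        | BadValue (expected, actual) ->
            [ { field = path
                message = sprintf "The value '%s' was not a valid %s" actual expected } ]
        | MissingField ->
            [ { field = path
                message = sprintf "The field '%s' is required" item.Field } ]
        | Nested -> []
    own @ List.collect (flatten path) item.Children

// Hand-built error tree matching the JSON example above.
let errors =
    [ { Field = "number"; Reason = BadValue ("number", "a"); Children = [] }
      { Field = "audit"
        Reason = Nested
        Children =
            [ { Field = "createdBy"; Reason = MissingField; Children = [] }
              { Field = "date"; Reason = BadValue ("date", "2021-07-66"); Children = [] } ] } ]
    |> List.collect (flatten "")
```

The tree is built by hand here; producing it from a decoder is exactly the part that the question below is asking about.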

Is it possible to do something like this with Thoth? Or is the error always a string?

The docs point to this as an example of a decoder error:

Error at: `$.user.firstname`
Expecting an object with path `user.firstname` but instead got:
{
    "user": {
        "name": "maxime",
        "age": 25
    }
}
Node `firstname` is unknown.

I'm looking for an approach to achieve these validation errors using Thoth, thanks.

Hello @daniellittledev,

The docs point to this as an example of a decoder error:

Error at: `$.user.firstname`
Expecting an object with path `user.firstname` but instead got:
{
    "user": {
        "name": "maxime",
        "age": 25
    }
}
Node `firstname` is unknown.

This is indeed the kind of error that Thoth.Json produces, except for the "Node `firstname` is unknown." part. That is either a leftover from a previous version or an error on my side when writing the documentation.

Is it possible to do something like this with Thoth? Or is the error always a string?

Right now, it is not possible to produce the kind of error you mention with Thoth.Json.

The current implementation of Thoth.Json stops after the first error it encounters because, at the time of writing, I thought that if the JSON is not valid there is no need to go further. It also makes the implementation much easier 😇

There are special cases where it can report several errors, but this is limited to the Decode.object and Decode.oneOf decoders, and it will never go deeper than the current level of the error.

Example:

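open Fable.Core // provides JS.console
open Thoth.Json // provides Decode and Decoder

// "user" deliberately lacks both "firstname" and "age"
let json =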
    """
    {
        "user": {
            "name": "Maxime"
        }
    }
    """

type User =
    {
        Firstname : string
        Age : int
    }

module User =

    let decoder : Decoder<User> =
        Decode.object (fun get ->
            {
                Firstname = get.Required.Field "firstname" Decode.string
                Age = get.Required.Field "age" Decode.int
            }
        )

type MyJson =
    {
        User : User
    }

module MyJson =

    let decoder : Decoder<MyJson> =
        Decode.object (fun get ->
            {
                User = get.Required.Field "user" User.decoder
            }
        )

match Decode.fromString MyJson.decoder json with
| Ok _ ->
    JS.console.log ("Valid json")

| Error errorMessage ->
    JS.console.error errorMessage

Generates:

I run into the following problems:

Error at: `$.user`
Expecting an object with a field named `firstname` but instead got:
{
    "name": "Maxime"
}

Error at: `$.user`
Expecting an object with a field named `age` but instead got:
{
    "name": "Maxime"
}

Adding support for this kind of feature in Thoth.Json should be possible, but it will need a complete rewrite of Thoth.Json's internals and will be complex to do. Indeed, we will have to detect when the decoder can or cannot go down further. For example, if a property is missing we can't (don't want to?) try to decode further down, because it will just generate more errors.

What you are asking for matches a need I discovered when working on another project. There, I am using Thoth.Json to validate a configuration file, and the errors currently reported are not so great for that case. So I would like to be able to customize the error messages of Thoth.Json in order to generate "better" error messages for this case.

I already had a complete rewrite of Thoth.Json planned, so perhaps it will be a good opportunity to explore the addition of this feature.

I am unsure whether I want the "new error" report to be the only one or to keep the current behavior too. The benefit of the current behavior is that it will be more performant than the new one, because it stops at the first error encountered and so saves some computation. But Thoth.Json isn't really about performance and in theory will never be able to compete with native JSON libraries or things like Newtonsoft.Json, because it adds an overhead on top of the JSON parsing.

On the other hand, reporting all the errors at once can also improve the developer experience. Currently, if there are several errors in the JSON, the developer will discover them one by one instead of being able to fix everything in one go.
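The difference between the two behaviours can be sketched in plain F#, without Thoth.Json. This is a minimal illustration, not how Thoth.Json is or will be implemented: FieldError, parseInt, required, failFast, and accumulate are all hypothetical names, and the checks work on raw strings rather than on parsed JSON.

```fsharp
// Hypothetical user-facing error type.
type FieldError = { field : string; message : string }

// Each check validates one raw value on its own.
let parseInt (field : string) (raw : string) : Result<int, FieldError list> =
    match System.Int32.TryParse(raw) with
    | true, n -> Ok n
    | false, _ ->
        Error [ { field = field
                  message = sprintf "The value '%s' was not a valid number" raw } ]

let required (field : string) (raw : string option) : Result<string, FieldError list> =
    match raw with
    | Some v -> Ok v
    | None ->
        Error [ { field = field
                  message = sprintf "The field '%s' is required" field } ]

// Fail-fast: only the first error survives. A real decoder would also
// skip evaluating the second check entirely, which is where time is saved.
let failFast a b =
    a |> Result.bind (fun x -> b |> Result.map (fun y -> x, y))

// Accumulating: looks at both results and concatenates the error lists.
let accumulate a b =
    match a, b with
    | Ok x, Ok y -> Ok (x, y)
    | Error e1, Error e2 -> Error (e1 @ e2)
    | Error e, _ | _, Error e -> Error e

let number = parseInt "number" "a"
let createdBy = required "audit.createdBy" None

let first = failFast number createdBy   // Error with one entry ("number")
let all = accumulate number createdBy   // Error with two entries
```

The accumulating version always has to run every check, which is the extra cost discussed above; in exchange, every problem is reported in one pass.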

To summarize, with the current version of Thoth.Json it is not possible to have all the errors reported at once but I think it could be a nice addition for a future version of Thoth.Json.

Thanks, it's good to know that this scenario isn't supported at the moment.

With regard to supporting the short-circuit error handling mode, I would vote to go without it. Performance* would be the only use case for it, and the developer experience and validation functionality would be much more important. I believe the majority of JSON parsing libraries only support multiple errors too.

* Performance of error handling does seem very niche, and honestly, if you want that much perf you might be better off with a binary format.