Update `raise_on_unknown_json_key` flag to raise a more helpful error for debugging purposes
rnag opened this issue · 0 comments
- Dataclass Wizard version: 0.21.0
Description
I want to update the `UnknownJSONKey` exception that gets raised when the `raise_on_unknown_json_key` flag is enabled to include a list of all the unknown JSON keys, rather than only the first such unknown key in the JSON object.
I also want to update the error raised to include a resolution message, with more developer-friendly details on the suggested dataclass fields to add to the model to resolve the issue. I envision this will be really helpful for others -- at least, I found myself needing such a feature recently when I was attempting to parse a (not well-documented) API response from a web service.
What I Did
Consider this simple, but rather contrived, example:
    from __future__ import annotations

    from dataclasses import dataclass

    from dataclass_wizard import JSONWizard


    @dataclass
    class MainClass(JSONWizard):

        class _(JSONWizard.Meta):
            raise_on_unknown_json_key = True

        my_first_field: str


    data = {
        'my-first-field': 'my-string',
        'my-second-field': '7',
        'myThirdField': [
            {
                'inner-field-1': '1.23',
                'InnerField2': True
            }
        ],
        'my-fourth-field': '2021-12-31'
    }

    c = MainClass.from_dict(data)  # error!

    # shouldn't get this far...
    print(c)
Here's the error I currently get:
dataclass_wizard.errors.UnknownJSONKey: A JSON key is missing from the dataclass schema for class `MainClass`.
unknown key: 'my-second-field'
dataclass fields: ['my_first_field']
input JSON object: {"my-first-field": "my-string", "my-second-field": "7", "myThirdField": [{"inner-field-1": "1.23", "InnerField2": true}], "my-fourth-field": "2021-12-31"}
This is of course expected behavior, since we enabled the `raise_on_unknown_json_key` flag in the Meta config.
The problem here, however, is that multiple fields in the JSON object are missing from the dataclass schema. It would be very helpful if the output listed all of the unknown JSON keys, along with an auto-generated dataclass schema showing the lines that should be added to the model. I imagine the latter will be especially helpful when designing a model that is expected to match an API output 1:1.
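To make the "list all unknown keys" idea concrete, here is a minimal, standard-library-only sketch. It is not the library's implementation: `to_snake_case` and `find_unknown_keys` are hypothetical helpers invented for illustration, the key normalization is a simplified assumption, and `MainClass` is a plain dataclass here so the snippet runs without `dataclass_wizard` installed.

```python
import re
from dataclasses import dataclass, fields


def to_snake_case(key: str) -> str:
    """Hypothetical normalizer: 'my-first-field' or 'myThirdField' -> snake_case."""
    key = key.replace('-', '_').replace(' ', '_')
    # Insert an underscore at each lowercase/digit -> uppercase boundary.
    key = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', key)
    return key.lower()


def find_unknown_keys(cls, json_dict: dict) -> list:
    """Return every JSON key with no matching dataclass field, in input order."""
    known = {f.name for f in fields(cls)}
    return [k for k in json_dict if to_snake_case(k) not in known]


@dataclass
class MainClass:
    my_first_field: str


data = {
    'my-first-field': 'my-string',
    'my-second-field': '7',
    'myThirdField': [{'inner-field-1': '1.23', 'InnerField2': True}],
    'my-fourth-field': '2021-12-31',
}

# Collect all unknown keys up front instead of stopping at the first one.
unknown = find_unknown_keys(MainClass, data)
print(f'There are {len(unknown)} JSON keys missing from the dataclass schema.')
print(f'  unknown keys: {unknown!r}')
```

The point of the sketch is only that the full list can be gathered in a single pass over the JSON object before the error is raised.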
For example, this is a sample of the output I might expect:
dataclass_wizard.errors.UnknownJSONKey: There are 3 JSON keys missing from the dataclass schema for class `MainClass`.
unknown keys: ['my-second-field', 'myThirdField', 'my-fourth-field']
dataclass fields: ['my_first_field']
input JSON object: {"my-first-field": "my-string", "my-second-field": "7", "myThirdField": [{"inner-field-1": "1.23", "InnerField2": true}], "my-fourth-field": "2021-12-31"}
suggested resolution: Update the dataclass schema to add the new fields below.
    @dataclass
    class MainClass(JSONWizard):
        ...
        my_second_field: int | str
        my_third_field: list[MyThirdField]
        my_fourth_field: date


    @dataclass
    class MyThirdField:
        inner_field_1: float | str
        inner_field2: bool
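As a rough illustration of how annotations such as `int | str` or `date` in the sample output could be derived from sample values, here is a small standard-library sketch. `suggest_type` is a hypothetical helper written for this issue, not part of `dataclass_wizard`, and it deliberately ignores nested lists and objects (they fall through to `Any`).

```python
from datetime import date


def suggest_type(value) -> str:
    """Guess a type annotation (as source text) for one sample JSON value."""
    # bool must be checked before int, since bool is a subclass of int.
    if isinstance(value, bool):
        return 'bool'
    if isinstance(value, int):
        return 'int'
    if isinstance(value, float):
        return 'float'
    if isinstance(value, str):
        # A numeric-looking string may arrive as either a number or a string.
        for caster, hint in ((int, 'int | str'), (float, 'float | str')):
            try:
                caster(value)
                return hint
            except ValueError:
                pass
        try:
            date.fromisoformat(value)
            return 'date'
        except ValueError:
            return 'str'
    # Nested lists and objects are out of scope for this sketch.
    return 'Any'


sample = {
    'my_second_field': '7',
    'inner_field_1': '1.23',
    'inner_field2': True,
    'my_fourth_field': '2021-12-31',
}

# Print one suggested dataclass field line per sample value.
for name, value in sample.items():
    print(f'{name}: {suggest_type(value)}')
```

A real implementation would also need to recurse into nested structures to produce classes like `MyThirdField`.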
Notes
- There's in fact a (somewhat) trivial approach. We can use a localized import of `dataclass_wizard.wizard_cli.PyCodeGenerator` to generate the desired dataclass schema. We also need to strip out the fields that already exist in the dataclass before passing the JSON to the generator. Something like this:

      unknown_keys = {k: v for k, v in json_dict.items()
                      if normalize_key(k) not in key_to_dataclass_field}
      py_code = PyCodeGenerator(file_contents=json.dumps(unknown_keys),
                                experimental=True).py_code