[FEATURE] When a nested field has mismatched type print the full path to that nested field
abroglesc opened this issue · 5 comments
Summary
In a complex structure like the following:
{
"source_machine": {
"port": 80
},
"dest_machine": {
"port": "http-port"
}
}
If another log record had dest_machine.port as an integer, this would error and simply state something like:
Ignoring field with mismatched type: old=(hard,port,NULLABLE,STRING); new=(hard,port,NULLABLE,INTEGER)
At this point you are left to figure out which structure this port column actually exists in. This is a simple example, but as the schema grows and becomes more complex, the problem gets harder to resolve manually.
Ideally, we can track the path to this using a JSON path or dpath expression, something like dest_machine.port. This will likely take adding an additional argument to the recursive function merge_schema_entry, something like base_path=None, and continually building up that base_path string in each recursive iteration so that it can be used in the errors, like "{}.{}".format(base_path, new_name) and "{}.{}".format(base_path, old_name).
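To make the base_path idea concrete, here is a rough sketch of threading a path through a recursive comparison. It is not the actual merge_schema_entry code: find_mismatches and full_path are hypothetical helpers, they compare raw records instead of schema entries, and the message wording is made up for illustration.

```python
def full_path(base_path, name):
    """Join the parent path and a field name; top-level fields get no prefix."""
    return name if base_path is None else "{}.{}".format(base_path, name)


def find_mismatches(old, new, base_path=None):
    """Recursively compare two nested dicts and report type mismatches by path.

    Hypothetical helper for illustration only; it is not the library's
    merge_schema_entry and it works on raw records, not schema entries.
    """
    errors = []
    for name, new_value in new.items():
        if name not in old:
            continue  # a field seen for the first time is not a mismatch
        old_value = old[name]
        path = full_path(base_path, name)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Descend, handing the accumulated path to the next level.
            errors.extend(find_mismatches(old_value, new_value, base_path=path))
        elif type(old_value) is not type(new_value):
            errors.append(
                "Mismatched type at {}: old={}; new={}".format(
                    path, type(old_value).__name__, type(new_value).__name__
                )
            )
    return errors


first = {"source_machine": {"port": 80}, "dest_machine": {"port": "http-port"}}
second = {"source_machine": {"port": 8080}, "dest_machine": {"port": 443}}
errors = find_mismatches(first, second)
for message in errors:
    print(message)
```

With the two records above, only dest_machine.port is reported, so the two port fields are no longer ambiguous.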
I can't remember, does the script print out the line number of the record with the error? Does that help?
The JSON path to the error is a reasonable idea. I'm happy to review a PR if you have something in mind. Otherwise, it might take me a while to get to this, since it won't rise high on my priority list...
Oh, I understand your problem, you have 2 port fields, so the line number does not help.
Fixed
Pushed v1.2 to PyPI.