CZ-NIC/yangson

Bugs relating to json_pointer(), resource ID, instance ID

Closed this issue · 1 comments

Yangson version: 1.3.62 (on python3.7)

In some error-handling code for Yangson validation errors, we are trying to use peek to access the data at the path from the exception. To do this we are initially reinterpreting the path from the exception (exc.path) as a resource ID or instance ID directly, since there doesn't appear to be another way of converting from the path - which, according to the naming of the relevant function in the Yangson code, is a JSON pointer, and so I refer to it as such below - to either of those IDs.

The input data concerned was failing a "must" expression in the YANG schema. The JSON pointer was retrieved with exc.path where exc is the exception raised by validate():

try:
    inst = data_model.from_raw(yaml_object)
    inst.validate()
except ValidationError as exc:
    json_pointer = exc.path
    //route_from_path = data_model.parse_instance_id(json_pointer)
    // or
    route_from_path = data_model.parse_resource_id(json_pointer)
    non_validating_member = root.peek(route_from_path)

I'd like to raise the following issues I encountered during this experimentation:

  1. There should be an API call to convert that path to an XPath (instance ID) or resource ID, for use with the parse_*_id, so that you can then use peek or goto. Alternatively the path could be available as an instance ID or resource ID directly on the exception object. It is somewhat fiddly to convert between them

  2. When attempting to use the JSON pointer as an instance ID, this failed (parser threw an UnexpectedInput exception on an = character; it was expecting a / character) because the pointer contained key=value syntax for indexing into a YANG list.

  • The code in schemanode.py (function _check_must) suggests that the JSON pointer in the exception should be unexpanded - no expand_keys=True is passed to the inst.json_pointer() call on line 900. As such I would have expected to see a regular JSON pointer with a list index (e.g. .../list/4/child), rather than the expanded key=value syntax (.../list/list-key=value-for-index-4/child). Is this a bug?
    • If you had some function that converts unexpanded pointers to XPaths (essentially, turning 0-based list indices between slashes into 1-based list indices between square brackets), then you could get from unexpanded syntax to expanded by going through that function, parse_instance_id, and goto. However, it is much harder to go back from expanded syntax to unexpanded. Therefore, if there is debate as to which form should be returned by the path property, I would vote for unexpanded. But overall I'd suggest that being able to get resource or instance IDs directly would be far superior to either.
  • Technically the form with the key=value cannot be called a JSON pointer (rather a resource identifier), but since it is only called "JSON pointer" in the code and not externally, I think this is OK. That said, it would be good to have some exact documentation on what the path property of certain exceptions actually represents - this page of the docs is very sparse.
  1. When attempting to use the JSON pointer as a resource ID, this failed because the list key was a string field where the value of that string in the input data contained a comma. Yangson does not percent-encode commas in the key values when constructing the expanded JSON pointer. https://tools.ietf.org/html/rfc8040#section-3.5.3 says:

If a data node in the path expression is a YANG list node, then the
key values for the list (if any) MUST be encoded according to the
following rules:
...
The key value is specified as a string, using the canonical
representation for the YANG data type. Any reserved characters
MUST be percent-encoded, according to Sections 2.1 and 2.5 of
[RFC3986]. The comma (",") character MUST be percent-encoded if
it is present in the key value.

This caused an issue when trying to reinterpret the expanded JSON pointer using parse_resource_id as in the above code. Since the expanded JSON pointer passed to parse_resource_id contained a comma, parse_resource_id complained there were too many keys. This method worked fine when I replaced the comma in the input data with some other character.

Let me know if you'd prefer I split these points out into separate issues, but I thought it might be good to first discuss the general approach to what exc.path should return.

Thanks,

Colin Haywood
Metaswitch Networks (a Microsoft company)

Fixed by PR #97 (merged in 1.4.0)