gazpachoking/jsonref

Support $ref replacement without the proxies

Closed this issue · 6 comments

When using a jsonref'd data structure with third-party libraries, the presence of the proxy objects in the object graph can sometimes pose compatibility issues.

In cases where it is not easy to replace usages of json.load() and json.dump() with their jsonref counterparts, an option to produce a json data structure without proxy objects is the simplest solution.

Something equivalent to what this utility function does:

def replace_jsonref_proxies(obj):
    """
    Replace jsonref proxies in the given json obj with the proxy target.
    Updates are made in place. This removes compatibility problems with 3rd
    party libraries that can't handle jsonref proxy objects.
    :param obj: json like object
    :type obj: int, bool, string, float, list, dict, etc
    """
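    # TODO: consider upstreaming in the jsonref library as a util method
    # Note: is_dict_like, is_list_like and iteritems are helpers that are not
    # shown in this snippet (iteritems is presumably six.iteritems).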
    def descend(fragment):
        if is_dict_like(fragment):
            for k, v in iteritems(fragment):
                if isinstance(v, jsonref.JsonRef):
                    fragment[k] = v.__subject__
                descend(fragment[k])
        elif is_list_like(fragment):
            for element in fragment:
                descend(element)

    descend(obj)

Would a PR with this functionality be considered for merging?

This issue is pretty stale, but first I want to say thanks, because this was really helpful. Second, I think it isn't necessary: I found that using deepcopy does the same thing, and it's built in:

In [1]: from copy import deepcopy

In [2]: import json

In [3]: import jsonref

In [4]: json_str = """{"real": [1, 2, 3, 4], "ref": {"$ref": "#/real"}}"""

In [5]: data = jsonref.loads(json_str)

In [6]: data['ref']
Out[6]: [1, 2, 3, 4]

In [7]: type(data['ref'])
Out[7]: jsonref.JsonRef

In [8]: deref = deepcopy(data)

In [9]: type(deref['ref'])
Out[9]: list

In [10]: deref['ref']
Out[10]: [1, 2, 3, 4]

In [11]: json.dumps(deref)
Out[11]: '{"real": [1, 2, 3, 4], "ref": [1, 2, 3, 4]}'

Granted, it doesn't replace in place, which can be an issue for large datasets, but for smaller applications this will do the trick.

@analogue's code worked well for us (Thanks!), but with a few modifications:

# The ABCs live in collections.abc; the collections.MutableMapping alias
# was removed in Python 3.10.
from collections.abc import MutableMapping, MutableSequence

from jsonref import JsonRef

def _replace_jsonref_proxies(obj):
    """
    Replace jsonref proxies in the given json obj with the proxy target.
    Updates are made in place. This removes compatibility problems with 3rd
    party libraries that can't handle jsonref proxy objects.
    :param obj: json like object
    :type obj: int, bool, string, float, list, dict, etc
    """
    # TODO: consider upstreaming in the jsonref library as a util method
    def descend(fragment):
        if isinstance(fragment, MutableMapping):
            # Reassigning existing keys does not resize the dict, so this is
            # safe to do while iterating over it.
            for k, v in fragment.items():
                if isinstance(v, JsonRef):
                    fragment[k] = v.__subject__
                descend(fragment[k])
        elif isinstance(fragment, MutableSequence):
            for i, element in enumerate(fragment):
                if isinstance(element, JsonRef):
                    fragment[i] = element.__subject__
                # Descend into the replaced value rather than the proxy.
                descend(fragment[i])

    descend(obj)

The deepcopy was something we preferred to avoid.
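
To make the trade-off between the two approaches concrete without needing jsonref installed, here is a minimal sketch. `FakeRef` is a hypothetical stand-in (not part of jsonref) that only mimics the `__subject__` attribute of `JsonRef`, and `replace_proxies_in_place` is an illustrative name for the same kind of walk as above. The in-place walk reuses the existing containers, while deepcopy allocates new ones.

```python
from collections.abc import MutableMapping, MutableSequence
from copy import deepcopy


class FakeRef:
    """Hypothetical stand-in for jsonref.JsonRef; only mimics __subject__."""

    def __init__(self, subject):
        self.__subject__ = subject


def replace_proxies_in_place(obj, proxy_type=FakeRef):
    """Swap every proxy found in dict values or list items for its target."""

    def descend(fragment):
        if isinstance(fragment, MutableMapping):
            for k, v in list(fragment.items()):
                if isinstance(v, proxy_type):
                    fragment[k] = v.__subject__
                descend(fragment[k])
        elif isinstance(fragment, MutableSequence):
            for i, element in enumerate(fragment):
                if isinstance(element, proxy_type):
                    fragment[i] = element.__subject__
                descend(fragment[i])

    descend(obj)


real = [1, 2, 3, 4]
data = {"real": real, "refs": [FakeRef(real)], "ref": FakeRef(real)}

replace_proxies_in_place(data)

# The proxies are gone and nothing was copied: each entry is the same list.
assert data["ref"] is real and data["refs"][0] is real

# deepcopy, by contrast, builds a fresh structure.
copied = deepcopy(data)
assert copied["real"] is not real
```

One limitation of this sketch: a reference that points back at one of its own ancestors (for example a `$ref` to `#`) would make `descend` recurse forever, so a real implementation would need to track the containers it has already visited.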

Done, in v1.0.0

Curiously, I've been dumping JSON Schema dereferenced with jsonref using json.dump without issue. Was this issue related to non-standard-library serializers?

Hmm, really?

import json

import jsonref

f = {
    "foo": {"$ref": "#/definitions/bar"},
    "bar": "blah"
}
data = jsonref.replace_refs(f)
print(json.dumps(data))
Traceback (most recent call last):
  File "C:\Users\chase\AppData\Roaming\JetBrains\PyCharm2022.2\scratches\scratch.py", line 10, in <module>
    print(json.dumps(data))
  File "C:\Python310\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "C:\Python310\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Python310\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "C:\Python310\lib\json\encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type str is not JSON serializable

(It's really talking about the JsonRef object here, not str. But the proxy passes the class name through.)
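
The misleading class name can be reproduced with the standard library alone. The sketch below assumes CPython and uses a hypothetical `ForwardingProxy` class (not jsonref's actual implementation) whose only trick is reporting its target's class through `__class__`. The encoder's fast path looks at the real type, so the object is rejected, but the error message is built from `o.__class__.__name__` and therefore says `str`.

```python
import json


class ForwardingProxy:
    """Hypothetical minimal proxy that reports its target's class."""

    def __init__(self, subject):
        self._subject = subject

    @property
    def __class__(self):
        # Forward the class lookup to the wrapped object.
        return type(self._subject)


proxy = ForwardingProxy("blah")

print(type(proxy).__name__)      # the real type: ForwardingProxy
print(proxy.__class__.__name__)  # the forwarded class: str

try:
    # Wrapped in a dict, as in the traceback above, so the encoder reaches
    # the proxy as a nested value.
    json.dumps({"foo": proxy})
except TypeError as exc:
    message = str(exc)
    print(message)
```

On CPython the printed message names `str` even though the object that failed to serialize is the proxy.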

(That code needs to be #/bar.) Ah, looks like I was using jsonref.dump in one case, and in the other case, where I had replaced the refs, I was never actually dumping the result.