inveniosoftware/dictdiffer

Wrong patch format WRT the standard RFC 6902

This is the test that I ran:

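from dictdiffer.merge import Merger
from dictdiffer.resolve import UnresolvedConflictsException

# Inside a test method: root is the common ancestor of head and update.
self.root = {}
self.head = {'foo': 'baz1'}
self.update = {'foo': 'baz2'}

# The last argument is the (empty) set of conflict-resolution actions.
non_list_merger = Merger(self.root, self.head, self.update, {})
try:
    non_list_merger.run()
except UnresolvedConflictsException as e:
    print(e.content)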

This is the result of the previous code:

[Conflict(('add', '', [('foo', 'baz1')]), ('add', '', [('foo', 'baz2')]))]

RFC 6902 defines "a JSON document structure for expressing a sequence of operations to apply to a JavaScript Object Notation (JSON) document" (https://tools.ietf.org/html/rfc6902).

Measured against that standard, IMHO the result is wrong for two reasons:

  1. It is not wrapped in an object (but this is a minor issue; tuples are fine).

  2. The format of the response is wrong because it does not return the key foo in the right place: the key ends up inside the value list instead of in the path. This forces users to handle different cases and manually build the path for a given patch.
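To illustrate the manual path building mentioned above, here is a minimal sketch of how a user might turn the tuples into RFC 6902-style operations. The helper name to_json_patch is hypothetical and is not part of dictdiffer; it assumes 'add' and 'remove' carry a list of (key, value) pairs and 'change' carries an (old, new) pair.

```python
def to_json_patch(diff_entries):
    """Hypothetical helper: map (op, path, values) tuples to RFC 6902-style dicts."""

    def pointer(parts):
        # JSON Pointer (RFC 6901) escaping: '~' becomes '~0', '/' becomes '~1'.
        return ''.join(
            '/' + str(part).replace('~', '~0').replace('/', '~1')
            for part in parts
        )

    operations = []
    for op, path, values in diff_entries:
        # Assumption: path is a dotted string or a list of keys.
        # Splitting on '.' is ambiguous when a key itself contains a dot.
        if isinstance(path, str):
            parts = path.split('.') if path else []
        else:
            parts = list(path)

        if op == 'add':
            # The key lives inside the values, so it must be appended to the path.
            for key, value in values:
                operations.append(
                    {'op': 'add', 'path': pointer(parts + [key]), 'value': value}
                )
        elif op == 'remove':
            for key, _ in values:
                operations.append({'op': 'remove', 'path': pointer(parts + [key])})
        elif op == 'change':
            # Here the key is already part of the path; values is (old, new).
            _, new_value = values
            operations.append(
                {'op': 'replace', 'path': pointer(parts), 'value': new_value}
            )
        else:
            raise ValueError('unknown operation: %r' % (op,))
    return operations


print(to_json_patch([('add', '', [('foo', 'baz1')])]))
# [{'op': 'add', 'path': '/foo', 'value': 'baz1'}]
```

Note the two different cases: for 'add' and 'remove' the key has to be pulled out of the values, while for 'change' it is already in the path.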

Thanks for the suggestions. I don't think dictdiffer was ever meant to generate patches according to JSONPatch, and dictdiffer also diffs more than just JSON.

You could add a new feature to support output in JSONPatch, but I'm not sure how much work that would involve.

Is the current patch format documented in any other way except in the tests? I am thinking of some kind of a grammar that we adhere to.

I agree that the docstring contains a number of excellent examples, but it still leaves it to the reader to infer what the rules are, especially as it is all plain values with no keys.

I am thinking of something simple but explicit like this:

  • diff is a generator that yields zero or more tuples of the format (op, path, values), where
  • op is one of the strings 'add', 'change' or 'remove'
  • path is by default a dot-separated string of keys from the root of the structure to the point of difference.
    • If parameter dot_notation is set to False, path is a list of separate key strings instead.
  • values are one or more tuples containing:
    • Key/value pairs for 'add' and 'remove'; keys for lists are indexes
    • Previous/new values for 'change', key being a part of the path in this case
    • Several value tuples sharing the same op and path are wrapped in a list, unless you specify expand=True, in which case they all get a separate (op, path, values) tuple.
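The rules above could also be written down as a small checker, which might make gaps in the description easier to spot. This is only a sketch of the rules as proposed in this comment, not verified against dictdiffer itself; describe_entry is a hypothetical name, and the 'change' entry in the usage lines is made up for illustration.

```python
VALID_OPS = {'add', 'change', 'remove'}


def describe_entry(entry):
    """Hypothetical checker for one (op, path, values) tuple, per the rules above."""
    op, path, values = entry
    if op not in VALID_OPS:
        raise ValueError('op must be one of %s' % sorted(VALID_OPS))
    # Dotted string by default, list of keys when dot notation is turned off.
    if not isinstance(path, (str, list)):
        raise TypeError('path must be a string or a list of keys')

    if op == 'change':
        # Previous/new values; the changed key is the last part of the path.
        previous, new = values
        return f'change at {path!r}: {previous!r} -> {new!r}'

    # 'add' and 'remove': one (key, value) pair, or a list of such pairs.
    pairs = values if isinstance(values, list) else [values]
    keys = [key for key, _ in pairs]
    return f'{op} under {path!r}: keys {keys!r}'


print(describe_entry(('add', '', [('foo', 'baz1')])))
# add under '': keys ['foo']
print(describe_entry(('change', 'foo', ('baz1', 'baz2'))))
# change at 'foo': 'baz1' -> 'baz2'
```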

If we can make sure that the above is correct and complete, I can PR it to the docstring.

And for some reason, I really do not "get" path_limit. A description would add value there as well.

@mikaelho a PR with the description from your comment is welcome 🙏