wI2L/jsondiff

[Feature] An option to avoid generation of diff in re-ordered Slices?

iprashantjain opened this issue · 11 comments

The jsondiff library works well for us in terms of diff generations between two structs. Our problem starts to appear when the changes are made in a slice field of a struct that is they are re-ordered or any element is added/removed in the start/middle of the slice, this causes all of the elements to be shifted which is desired use-case as per the RFC-6902 and so the library generates a series of diff's. This causes problems in our use case as we don't really care about ordering and so the question:

  • Is there an option/flag to avoid the generation of several diff's in case of re-ordering or addition/removal of elements in the start/middle of a slice? or we can override this behavior of index-based comparison to content-based comparison in slices?
wI2L commented

Hello,

As you pointed it out, the lib will generate several if the items of a slice/array are reordered. I think it could reasonably be an opt-in feature actived with a functional option, similar to the existing ones. I'll look into it.

wI2L commented

@prashjai94 Would you like to work on this and send a PR ?

@wI2L
Thanks for giving me the opportunity to work on it.
I believe at this point, it would take me a lot of time to understand the code base and come up with an optimal solution.
It will be great if you can take this up :)

wI2L commented

@prashjai94
The comparison between JSON arrays is handled by compareArrays. For arrays of equal lengths and equal content, but ordered differently, using reflect.DeepEqual to check if the arrays are deeply equal is quite straightforward, and should be done before this line.

However, regarding the second part of your request:

or addition/removal of elements in the start/middle of a slice

I am still not sure to understand what you mean. If an element is added or removed "in the middle of the slice", as you say, the elements after the one being added/removed needs to be shifted to represent their new position, and it will generate N+1 operations (where N is the number of elements after the element being removed or added). If the element is added, the last operation is an add, otherwise the first is a remove, and anything else is a replace.

I see only one feature that we could implement here, an option to skip the generation of a patch serie if two arrays of equal lengths also have equal content, and nothing more.

Thanks for the comment @wI2L.

@prashjai94 The comparison between JSON arrays is handled by compareArrays. For arrays of equal lengths and equal content, but ordered differently, using reflect.DeepEqual to check if the arrays are deeply equal is quite straightforward, and should be done before this line.

I had the same thought but wasn't sure that it would be performant enough to do or not 😅
Let me gather some benchmarks for it and raise a PR on this along with other boilerplate 👍

However, regarding the second part of your request:

or addition/removal of elements in the start/middle of a slice

I am still not sure to understand what you mean. If an element is added or removed "in the middle of the slice", as you say, the elements after the one being added/removed needs to be shifted to represent their new position, and it will generate N+1 operations (where N is the number of elements after the element being removed or added). If the element is added, the last operation is an add, otherwise the first is a remove, and anything else is a replace.

I see only one feature that we could implement here, an option to skip the generation of a patch serie if two arrays of equal lengths also have equal content, and nothing more.

Agree with you!

I'll look into the contribution guidelines and raise a PR in the this week for the approved feature request.

wI2L commented

@prashjai94 Just to let you know, I have worked on several changes lately to increase the performance of the package; one of which is getting rid of the slow reflect.DeepEqual calls, which will surely make the requested feature viable performance-wise.
If you have already started on your side, please let me know, otherwise I'll include the new option we've discussed about with my changes.

wI2L commented

@prashjai94

Just pushed a new branch with the feature implemented.

Provisional benchmark results:

name                                         time/op
name                                time/op
Compare/differ_diff/default-8       5.29µs ± 1%
Compare/differ_diff/invertible-8    6.09µs ± 0%
Compare/differ_diff/factorize-8     7.64µs ± 1%
Compare/differ_diff/rationalize-8   30.9µs ± 1%
Compare/differ_diff/equivalent-8    6.69µs ± 1%
Compare/differ_diff/factor+ratio-8  30.3µs ± 1%
Compare/differ_diff/all-options-8   37.8µs ± 1%

name                                alloc/op
Compare/differ_diff/default-8       3.28kB ± 0%
Compare/differ_diff/invertible-8    5.97kB ± 0%
Compare/differ_diff/factorize-8     3.85kB ± 0%
Compare/differ_diff/rationalize-8   5.86kB ± 0%
Compare/differ_diff/equivalent-8    3.47kB ± 0%
Compare/differ_diff/factor+ratio-8  6.29kB ± 0%
Compare/differ_diff/all-options-8   7.75kB ± 0%

name                                allocs/op
Compare/differ_diff/default-8         32.0 ± 0%
Compare/differ_diff/invertible-8      33.0 ± 0%
Compare/differ_diff/factorize-8       47.0 ± 0%
Compare/differ_diff/rationalize-8     91.0 ± 0%
Compare/differ_diff/equivalent-8      36.0 ± 0%
Compare/differ_diff/factor+ratio-8     102 ± 0%
Compare/differ_diff/all-options-8      108 ± 0%

I'll add more tests/bench cases to verify how the new operation interacts with the current ones before merging.

Thanks, @wI2L!

I can see the code is merged to master, I'll try this out and reach out in case of an issue.

When do you plan to release the tag?

Appreciated your help!

wI2L commented

@prashjai94 Yeah I was working on performance improvement that i'll release with the feature under the same tag, hopefully next week. You can test with go get github.com/wI2L/jsondiff@master in the meantime.

wI2L commented

@prashjai94 Just released v0.2.0

@prashjai94 Just released v0.2.0

Thanks a lot, I have tried with master branch and it works great! I'll switch it to latest tag 🙂