andreyvit/json-diff

Incorrect diff reported between identical arrays of mixed strings and objects

luckygerbils opened this issue · 4 comments

Creating a single file consisting of a JSON array with three entries: a string, an object, then a duplicate of that object and then diffing that file with itself reports differences (despite diffing the file with itself, which should be identical).

$ echo '[ "a", { "foo": "bar" }, { "foo": "bar" } ]' >array.json
$ json-diff --full array.json array.json
 [
   "a"
-  {
-    foo: "bar"
-  }
   {
     foo: "bar"
   }
+  {
+    foo: "bar"
+  }
 ]

Removing the initial string entry correctly reports no diff, so that's somehow a trigger for the bug. I'm guessing something is going wrong with the diff trying to identify moved array elements (which it should probably only be doing after it's detected that array elements aren't structurally equal?).

Tested locally with version 0.7.1 using Node v12.22.10.

Runkit demonstrating issue directly with NodeJS API: https://runkit.com/embed/gpzkiqddw0e0

I think I've identified what the problem is. This is a hacky fix/workaround, but probably not the right solution.

The issue is that, when scalarizing the first array, the original index of each object in the array is tracked only incidentally via the object entry position in fuzzyOriginals. But since the initial "a" element is a scalar already, it isn't put into fuzzyOriginals and therefore the entry position of every subsequent object vs it's actual original index is off by one.

My hacky fix just adds a placeholder for any actual scalar values to fuzzyOriginals so that the indexes match up. This appears to fix the issue, at least in the test case I added.

This doesn't feel like the best solution though. It seems like it would be better to explicitly remember the original index of each scalarized value by storing something like originals[key] = { item, index } instead of originals[key] = item. This requires changes to the descalarizing code though, which I haven't looked at yet. I can do that and send a pull request if my interpretation of this issue sounds correct to you.

I sent a pull request here that I think is a better solution to this than what I originally posted here. Let me know what you think.

tuct commented

We have the exact same issue,
+1 for merging the PR soon and thanks a lot to everyone involved!