diff_cleanupSemantic sometimes produces much worse results (visually)
Opened this issue · 1 comments
GoogleCodeExporter commented
What steps will reproduce the problem?
Call diff_cleanupSemantic with the results of diff_main.
What is the expected output? What do you see instead?
I expect the output to be at least as visually ‘clean’ as with no
diff_cleanupSemantic call.
Sometimes the output is much worse after the diff_cleanupSemantic call.
What version of the product are you using? On what operating system?
JavaScript version, r103.
Please provide any additional information below.
I’ve been using DMP for a few years now to display diffs to text files before
svn commits. (Thank you.)
I sometimes see results that don’t look as I would expect (or like).
I’ve finally tracked it down to a simple example.
I always call diff_cleanupSemantic after diff_main as the final output is for
user display.
The outputs that result from this appear to vary
unexpectedly/uncontrollably/randomly.
The following small example produces a poor display after diff_cleanupSemantic,
but a slightly better display if each tab is replaced with 2 spaces, better
still with 6 spaces. However, if you replace all occurrences of 'comment' with
'ABC' then the output is barely changed after the diff_cleanupSemantic calls in
each case.
T1='BEGIN\n\tUInt16\tflaps;\n\taabb\tfat_mood:1;\n\txxyy\tisthing:1;\n\tccdd\ttr
ee_climb:1;\n\tUInt8\tflick:5;\nEND';
T2='BEGIN\n\tUInt16\tflaps;\t\t// first comment\n\taabb\tfat_mood:1;\t\t//
second comment\n\txxyy\tisthing:1;'+
'\t// third comment\n\tccdd\ttree_climb:1;\t// forth comment\n\tUInt8\tflick:3;\t// fifth comment\nEND';
diffs=DMP.diff_main(T1, T2)
diffs =>
BEGIN¬
UInt16 flaps;<ins> // first comment</ins>¬
aabb fat_mood:1;<ins> // second comment</ins>¬
xxyy isthing:1;<ins> // thrid comment</ins>¬
ccdd tree_climb:1;<ins> // forth comment</ins>¬
UInt8 flick:<del>5</del><ins>3</ins>;<ins> // fifth comment</ins>¬
END
diff_cleanupSemantic(diffs) =>
BEGIN¬
UInt16 flaps;<del>¬
aabb fat_mood:1;¬
xxyy isthing:1;¬
ccdd tree_climb:1;¬
UInt8 flick:5;</del><ins> // first comment¬
aabb fat_mood:1; // second comment¬
xxyy isthing:1; // thrid comment¬
ccdd tree_climb:1; // forth comment¬
UInt8 flick:3; // fifth comment</ins>¬
END
While the second has less elements than the first it cannot be said to meet the
intent to “Increase human readability by factoring out commonalities which
are likely to be coincidental”.
The attached HTML file, when dropped in the diff_match_patch/javascript folder
& opened in a browser, fully illustrates the problem.
Original issue reported on code.google.com by chris...@gmail.com
on 30 Mar 2012 at 2:46
Attachments:
GoogleCodeExporter commented
I've seen the same problem, often if you have taken single line function call,
and broken it across multiple lines for a change, the raw output always only
shows insertions of newlines, but after semantic cleanup we have larger
deletions and insertions, and what it determines to be part of the those
deletions and insertions is quite variable depending on the level of
indentation/spaces, as noted in the bug above.
For example:
Before Text:
EXPECT_SUCCESS(SM::getMV(noSubject, S::Builder().set_values(values).finalize(), dummy, results));
After Text:
EXPECT_SUCCESS(SM::getRMV(noSubject,
S::Builder().set_values(values).finalize(),
dummy,
results));
Diff and then semantic cleanup:
EXPECT_SUCCESS(SM::getRMV(noSubject,<ins>¶</ins>
S::Builder().set_values(values).finalize(), <del>dummy,</del><ins>¶
dummy,¶</ins>
results));
Original comment by d...@twkie.net
on 27 Nov 2012 at 11:57