inje/google-diff-match-patch

Replacing String with StringBuffer instances in patch_apply

We used this DiffMatchPatch code extensively for tracking differences between
versions of database objects. Some of our calls can generate more than 25k
patch strings that need to be applied to large chunks of text. In these cases
we were seeing severe performance problems and inefficient memory use
(even on Java 6).
This patch replaces String instances with StringBuffer and implements more
efficient INSERT and DELETE operations, which drastically reduces memory usage
for large text strings.
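
As a rough illustration of the difference described above (this is not the
attached patch, and the class, enum, and method names are hypothetical), the
sketch below applies the same INSERT and DELETE edits two ways: by rebuilding
an immutable String on every edit, and by editing a single StringBuffer in
place. It assumes each edit carries an absolute offset and its text.

```java
import java.util.List;

// Hypothetical sketch: none of these names come from diff-match-patch.
final class InPlaceEditSketch {
    enum Op { INSERT, DELETE }

    // One edit: the operation, the offset it applies at, and the affected text.
    static final class Edit {
        final Op op;
        final int offset;
        final String text;

        Edit(Op op, int offset, String text) {
            this.op = op;
            this.offset = offset;
            this.text = text;
        }
    }

    // String approach: each edit allocates a new String holding the whole text.
    static String applyWithStrings(String text, List<Edit> edits) {
        for (Edit e : edits) {
            if (e.op == Op.INSERT) {
                text = text.substring(0, e.offset) + e.text + text.substring(e.offset);
            } else {
                text = text.substring(0, e.offset)
                        + text.substring(e.offset + e.text.length());
            }
        }
        return text;
    }

    // StringBuffer approach: all edits mutate one buffer, so no new String
    // is created per edit; only the final toString() copies the text.
    static String applyWithBuffer(String text, List<Edit> edits) {
        StringBuffer buf = new StringBuffer(text);
        for (Edit e : edits) {
            if (e.op == Op.INSERT) {
                buf.insert(e.offset, e.text);
            } else {
                buf.delete(e.offset, e.offset + e.text.length());
            }
        }
        return buf.toString();
    }
}
```

Both methods return the same text; with tens of thousands of edits on a large
input, the second avoids creating a full-size temporary String per edit.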

Original issue reported on code.google.com by chris.p....@gmail.com on 4 Oct 2011 at 4:37

Thanks for this.  Virtually all the performance work on DMP has been spent on 
the Diff part.  Match and Patch aren't normally resource-intensive.

The big issue with this patch is that it changes the API for DMP.  Were I to 
push this, a ton of projects (including Google Docs) would suddenly stop 
compiling and my inbox would become very active.

One option is to overload the existing functions, adding a new API.  Before I 
commit to a new API (which would have to be supported indefinitely) I want to 
create a bunch of stress tests for match and patch in each language so that 
performance can be objectively compared.
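
A minimal sketch of the overloading option, under the assumption that the
String-based method keeps its exact signature and simply delegates to a new
StringBuffer-based overload. The class and method names here are hypothetical
and much simpler than the real DMP functions.

```java
// Hypothetical sketch: removeRange stands in for an existing String-based API.
final class OverloadSketch {
    // Existing-style entry point. The signature is unchanged, so code that
    // already calls it keeps compiling.
    static String removeRange(String text, int start, int end) {
        StringBuffer buf = new StringBuffer(text);
        removeRange(buf, start, end);
        return buf.toString();
    }

    // New overload. Callers that already hold a StringBuffer can have it
    // edited in place and skip the String round trip.
    static void removeRange(StringBuffer text, int start, int end) {
        text.delete(start, end);
    }
}
```

Existing callers would see no change, while callers applying many patches
could switch to the buffer-based overload. The cost is a second public method
that has to be supported from then on.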

Original comment by neil.fra...@gmail.com on 4 Oct 2011 at 8:43

  • Changed state: Accepted
  • Added labels: Performance