
Ruby wrapper for the Python library html-tree-diff.

License: MIT

ruby_html_tree_differ

Shows the difference between two HTML documents, rendered as HTML. Relies on the html-tree-diff Python module, which it calls through pycall.rb.

Prerequisites

You must have Python installed. You can set the Python command (python or python3) with the environment variable PYTHON. By default, pycall checks for python3, then python. If your RUBY_PLATFORM is x86_64-linux, you should also set the environment variable LIBPYTHON to the output of which python or which python3 (for me it was /usr/bin/python3). If you have trouble, check pycall's finder.rb.
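One way to set these variables from Ruby, rather than from the shell, is sketched below. The helper name configure_python_env is hypothetical and not part of the gem, and the path is only the one reported above; yours may differ. The sketch assumes the variables have to be in place before the gem (and therefore pycall) is loaded.

```ruby
# Hypothetical helper (not part of the gem): sets the variables pycall reads,
# without overwriting values that are already present in the environment.
def configure_python_env(env = ENV, python: "python3", libpython: nil)
  env["PYTHON"] ||= python
  # LIBPYTHON is only set when a value is passed in (see the note above).
  env["LIBPYTHON"] ||= libpython if libpython
  env
end

# Example values only; "/usr/bin/python3" is the path mentioned above.
configure_python_env(python: "python3", libpython: "/usr/bin/python3")

# Load the gem only after the environment is configured:
# require 'ruby_html_tree_differ'
```

Because the helper uses ||=, a PYTHON or LIBPYTHON value exported in the shell still takes precedence over the defaults in the script.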

Install

Gemfile:

gem "ruby_html_tree_differ"

Or:

gem install ruby_html_tree_differ

Example

require 'ruby_html_tree_differ'

differ = RubyHtmlTreeDiffer.new

old_doc = <<EOF
<p>Unchanged paragraph.</p>
<p>Altered paragraph.</p>
<p>Deleted paragraph.</p>
EOF

new_doc = <<EOF
<p>Unchanged paragraph.</p>
<p>Alterated paragraph.</p>
EOF

print differ.diff! old_doc, new_doc

Output:

<p>Unchanged paragraph.</p>
<p>
  <del>Altered</del>
  <ins>Alterated</ins>
   paragraph.
</p>
<del>
  <p>Deleted paragraph.</p>
</del>

See test/test.rb for two benchmarks. Very long documents take a long time to process. If you are using this in a web application, calculate the diff in a background job and cache the result.
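As a rough illustration of the caching advice, the sketch below wraps any object that responds to diff! and stores each result under a digest of the two input documents. CachedDiffer is a hypothetical class, not part of the gem, and the in-memory Hash stands in for whatever cache store your application uses. Running the diff in a background job is left out of the sketch.

```ruby
require 'digest'

# Hypothetical wrapper (not part of the gem): memoizes diff results so the
# same pair of documents is only diffed once.
class CachedDiffer
  # differ: any object responding to diff!(old_doc, new_doc)
  # store:  any Hash-like object; an in-memory Hash by default
  def initialize(differ, store = {})
    @differ = differ
    @store = store
  end

  def diff(old_doc, new_doc)
    # Prefix the old document's size so that different splits of the same
    # concatenated text produce different keys.
    key = Digest::SHA256.hexdigest("#{old_doc.bytesize}:#{old_doc}#{new_doc}")
    @store[key] ||= @differ.diff!(old_doc, new_doc)
  end
end

# Usage, assuming the gem is installed:
# cached = CachedDiffer.new(RubyHtmlTreeDiffer.new)
# html = cached.diff(old_doc, new_doc)  # computed once, then served from the cache
```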