Primary language: C++. License: GNU General Public License v3.0 (GPL-3.0).

JojoDiff - diff utility for binary files

Copyright © 2002-2011 Joris Heirbaut
This software is hosted by: http://sourceforge.net

1. Purpose

*JDIFF* is a program that outputs the differences between two
(binary) files.
*JPATCH* can then be used to reconstruct the second file from the
first file and the generated patch file.
For example:
  o *jdiff* archive0000.tar archive0001.tar archive0001.jdf
  o *jpatch* archive0000.tar archive0001.jdf archive0001b.tar
will create a file archive0001b.tar which is identical to
archive0001.tar.
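
The round trip above can also be scripted. The following Python sketch
is an illustration only, not part of JojoDiff: the helper names are
hypothetical, and it assumes that executables named jdiff and jpatch
(as in the Usage section) are on the PATH. Exit codes are not described
in this document, so the sketch ignores them and relies on a
byte-for-byte comparison instead.

```python
import filecmp
import shutil
import subprocess


def jdiff_cmd(original, new, patch):
    """Command line for: jdiff original_file new_file output_file."""
    return ["jdiff", original, new, patch]


def jpatch_cmd(original, patch, rebuilt):
    """Command line for: jpatch original_file patch_file output_file."""
    return ["jpatch", original, patch, rebuilt]


def round_trip(original, new, patch, rebuilt):
    """Create a patch, apply it, and report whether `rebuilt` equals `new`."""
    if shutil.which("jdiff") is None or shutil.which("jpatch") is None:
        raise FileNotFoundError("jdiff and jpatch must be on the PATH")
    # Exit codes are deliberately not checked (assumption: undocumented here);
    # the comparison below is the success criterion.
    subprocess.run(jdiff_cmd(original, new, patch))
    subprocess.run(jpatch_cmd(original, patch, rebuilt))
    return filecmp.cmp(new, rebuilt, shallow=False)


if __name__ == "__main__":
    ok = round_trip("archive0000.tar", "archive0001.tar",
                    "archive0001.jdf", "archive0001b.tar")
    print("identical" if ok else "DIFFERENT")
```

If your build names the patch program differently, adjust jpatch_cmd
accordingly.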

Possible applications include:
  o incremental backups,
  o synchronising files between two computers over a slow network
    (see *JSYNC*).

*JDIFF* tries to find a minimal set of differences between two files
using a heuristic algorithm with constant space and linear time
complexity. This means that accuracy is traded for speed: *JDIFF*
will not always find the smallest set of differences, but it tries
to be fast and uses a fixed amount of memory.
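
To make the idea of a heuristic copy/insert diff concrete, here is a
toy Python sketch. It is NOT the JDIFF algorithm and does not produce
JDIFF's patch format: it simply indexes fixed-size blocks of the
original data in a dictionary and emits "copy" and "insert" operations.
Unlike JDIFF, its memory use grows with the size of the original. The
function names and the block size are assumptions made for this
illustration.

```python
BLOCK = 16  # hypothetical block size for this toy example


def toy_diff(old: bytes, new: bytes, block: int = BLOCK):
    """Return a list of ("copy", offset, length) / ("insert", data) operations."""
    # Index the first occurrence of every aligned block of the old data.
    index = {}
    for off in range(0, len(old) - block + 1, block):
        index.setdefault(old[off:off + block], off)

    ops, pending, pos = [], bytearray(), 0
    while pos < len(new):
        src = index.get(new[pos:pos + block])
        if src is None:
            pending.append(new[pos])  # no match here: keep a literal byte
            pos += 1
            continue
        # Extend the match forward for as long as both inputs agree.
        length = block
        while (src + length < len(old) and pos + length < len(new)
               and old[src + length] == new[pos + length]):
            length += 1
        if pending:
            ops.append(("insert", bytes(pending)))
            pending.clear()
        ops.append(("copy", src, length))
        pos += length
    if pending:
        ops.append(("insert", bytes(pending)))
    return ops


def toy_patch(old: bytes, ops) -> bytes:
    """Rebuild the new data from the old data and the operation list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += old[off:off + length]
        else:
            out += op[1]
    return bytes(out)


if __name__ == "__main__":
    old = bytes(range(256)) * 4
    new = old[:100] + b"XYZ" + old[100:]
    ops = toy_diff(old, new)
    print(len(ops), "operations, round trip ok:", toy_patch(old, ops) == new)
```

Because only aligned blocks of the original are indexed, the toy can
miss matches or choose poor ones, which mirrors the trade-off described
above: the result is always a valid patch, but not necessarily the
smallest one.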

*JDIFF* does not compress the generated patch file. It is recommended
to compress it with any compression tool you like. See below for an
example using ZIP.

Download these utilities from Jojo's Binary Diff Download Page
<http://sourceforge.net/projects/jojodiff/>.


2. Version and history

The current version of this utility is beta 0.8, dating from
September 2011. The modification history is as follows:
    v0.1    June 2002       Insert/delete algorithm.
    v0.2a   June 2002       Optimized patch files.
    v0.2b   July 2002       Bugfix on code-length of 252.
    v0.2c   July 2002       Bugfix on divide-by-zero in verbose mode.
    v0.3a   July 2002       Copy/insert algorithm.
    v0.4a   September 2002  Select "best" of multiple matches.
    v0.4b   October 2002    Optimize matches.
    v0.4c   January 2003    Rewrote selection algorithm between
                            multiple matches.
    v0.6    April 2005      Support files larger than 2GB.
    v0.7    November 2009   Optimizations for files larger than 2GB.
    v0.8    September 2011  Conversion to C++ classes that should be
                            easier to reuse.


3. Installation

On Windows systems:
  o Compiled executables are within the "win32" directory. You can
    run them from a command prompt.

On GCC/Linux systems:
  o Compiled ELF binaries are within the "linux" directory.
  o You may also compile the source by running "make" within the
    "src" directory.
  o Copy the resulting binaries to /usr/local/bin.
  o Within the "bash" directory, you can find an example BASH script,
    *JSYNC*, which I use for synchronizing files between two
    computers connected over a slow network.


4. Usage

jdiff [options] original_file new_file [output_file]

*Options:*
    -v          Verbose (greeting, results and tips).
    -vv         Verbose (debug info).
    -h          Help (this text).
    -l          List byte by byte (ASCII output).
    -lr         List groups of bytes (ASCII output).
    -b          Try to be better (using more memory).
    -f          Try to be faster: using less memory, no out of buffer compares.
    -ff         Try to be faster: no out of buffer compares, no prescanning.
    -m size     Size (in kB) for look-ahead buffers (default 128).
    -bs size    Block size (in bytes) for reading from files (default 4096).
    -s size     Number of samples, in mega samples (default 8).

*Principles:*
    *JDIFF* tries to find equal regions between two binary files
    using a heuristic hash algorithm and outputs the differences
    between both files. Heuristics are generally used for improving
    performance and memory usage, at the cost of accuracy.
    Therefore, this program may not find a minimal set of
    differences between files.
*Notes:*
  o Options -m and -s should be used after -b, -f or -ff.
  o Accuracy may be improved by increasing the number of samples.
  o Speed may be increased with option -f or -ff (lower accuracy).
  o Sample size is always lowered to the largest n-bit prime (n < 32).
  o Original and new files must be random access files.
  o Output is sent to standard output if output file is missing.
*Important:*
    Do not use jdiff directly on compressed files, such as zip,
    gzip or rar, because compression programs tend to increase the
    difference between files! Instead, use jdiff on uncompressed
    archives, such as tar, cpio or zip -0, and then compress the
    files afterwards, including the jdiff patch file. Do not forget
    to uncompress the files again before using jpatch. For example:
    *zip* -0 archive0000.zip mydir/*
        put mydir in an archive
    *zip* -0 archive0001.zip mydir/*
        some time later
    *jdiff* archive0000.zip archive0001.zip archive0001.jdf
        difference between archives
    *zip* -9 archive0001.jdf.zip archive0001.jdf
        send compressed difference file to a friend
    *zip* -9 archive0000.zip.zip archive0000.zip
        compress the archive before sending to a friend
    *...*
    *unzip* archive0000.zip.zip
        restore uncompressed zip file
    *unzip* archive0001.jdf.zip
        restore uncompressed jdf file
    *jpatch* archive0000.zip archive0001.jdf archive0001b.zip
        recreate archive0001.zip
    *unzip* archive0001b.zip
        restore mydir

    You may also replace zip -0 by tar and zip -9 by gzip, or any
    other archiving and/or compression utility you like.
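
The effect described in the warning above can be observed with a small
experiment. The Python sketch below is an illustration only: it uses
zlib as a stand-in for zip/gzip, and the sample data and function names
are made up for this example. It counts how many byte positions differ
between two versions of some data, before and after compression. A
small edit changes only a few bytes of the raw data, while the
compressed streams typically differ over a much larger range.

```python
import zlib


def differing_positions(a: bytes, b: bytes) -> int:
    """Count positions where a and b differ, plus any difference in length."""
    common = min(len(a), len(b))
    diffs = sum(1 for i in range(common) if a[i] != b[i])
    return diffs + abs(len(a) - len(b))


def compare_raw_and_compressed(old: bytes, new: bytes):
    """Return (raw_diff, compressed_diff) for two versions of the same data."""
    return (differing_positions(old, new),
            differing_positions(zlib.compress(old, 9), zlib.compress(new, 9)))


if __name__ == "__main__":
    # Hypothetical sample data: 5000 short text records, one of them edited.
    old = b"".join(b"record %06d\n" % i for i in range(5000))
    new = old.replace(b"record 000010", b"RECORD 000010", 1)
    raw, packed = compare_raw_and_compressed(old, new)
    print("raw bytes differing:       ", raw)
    print("compressed bytes differing:", packed)
```

The exact numbers depend on the data and the compressor, so compare
the two counts relative to the sizes of the raw and compressed data.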

jpatch [options] original_file patch_file [output_file]

*Options:*
    -v          Verbose (greeting, results and tips).
    -vv         Verbose (debug info).
    -vvv        Verbose (more debug info).
    -h          Help (this text).

*Principles:*
    *JPATCH* reapplies a diff file, generated by jdiff, to the
    original file, restoring the new file.


5. Contacts and remarks

Author: 	Joris Heirbaut
Contact me via SourceForge <http://sourceforge.net/projects/jojodiff>

If you like this program, please let me know! If you reuse the
source code for your personal (open source) applications, it would
be great to let me know too.


6. Acknowledgements

Earlier versions of this software have been developed within the
Cygwin/GNU environment. More recently, development has been done in
Eclipse/CDT using GCC and MinGW/GCC.