/dedup

Find Duplicate Files in a Filesystem

Primary language: C++. License: MIT.

Duplicate File Finder


Walks through each given directory and adds every file it finds to a map. Then, for every set of files that share the same size, it hashes each file and lists the duplicates along with their hashes. Files with a unique size cannot be duplicates, so they are never hashed.
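The two-pass idea described above can be sketched as follows. This is a minimal illustration, not the code in this repository: the names `hash_file` and `find_duplicates` are hypothetical, the map is assumed to be keyed by file size, and a simple 64-bit FNV-1a hash stands in for the OpenSSL digest so that the sketch has no external dependency.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <map>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Stand-in hash (64-bit FNV-1a). dedup itself hashes with OpenSSL;
// this function is an assumption made to keep the sketch self-contained.
std::uint64_t hash_file(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    char buf[4096];
    // The final short read fails the stream but still reports gcount() > 0.
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        for (std::streamsize i = 0; i < in.gcount(); ++i) {
            h ^= static_cast<unsigned char>(buf[i]);
            h *= 1099511628211ULL;  // FNV prime
        }
    }
    return h;
}

// Returns groups of paths whose size and hash both match.
std::vector<std::vector<fs::path>> find_duplicates(const std::vector<fs::path>& roots) {
    // Pass 1: bucket every regular file by its size.
    std::map<std::uintmax_t, std::vector<fs::path>> by_size;
    for (const auto& root : roots) {
        std::error_code ec;
        for (fs::recursive_directory_iterator
                 it(root, fs::directory_options::skip_permission_denied, ec), end;
             !ec && it != end; it.increment(ec)) {
            std::error_code fec;
            if (it->is_symlink(fec) || !it->is_regular_file(fec)) continue;
            const auto size = it->file_size(fec);
            if (!fec) by_size[size].push_back(it->path());
        }
    }

    // Pass 2: hash only the files that share a size with at least one other file.
    std::vector<std::vector<fs::path>> result;
    for (const auto& [size, paths] : by_size) {
        if (paths.size() < 2) continue;  // unique size, cannot be a duplicate
        std::map<std::uint64_t, std::vector<fs::path>> by_hash;
        for (const auto& p : paths) by_hash[hash_file(p)].push_back(p);
        for (auto& [hash, group] : by_hash)
            if (group.size() > 1) result.push_back(std::move(group));
    }
    return result;
}
```

Calling `find_duplicates` with a list of directories returns one vector of paths per duplicate set. Grouping by size first is cheap because it only needs file metadata, so file contents are read only for the candidates that could actually be duplicates.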

For a sample of 14287 files (around 3.5 GB in total), the run times were:

  • rmlint -r: 0m9.553s
  • fdupes -r: 0m14.612s
  • dedup: 0m11.646s

Note

Before each run, the disk caches were cleared (this requires root):

$ free && sync && echo 3 >| /proc/sys/vm/drop_caches && free

Building:

After cloning the repository, build and run with:

cd dedup
make 
./dedup <dir1> <dir2> ...

Note that the program depends on OpenSSL for hashing, so the OpenSSL development headers and libraries must be installed before running make.