The program walks a directory tree, recording every file in a map keyed by file size. Then, for each set of files that share a size, it hashes each file and reports the duplicates along with their hashes.
For a sample of 14,287 files, around 3.5 GB:

rmlint -r : 0m9.553s
fdupes -r : 0m14.612s
dedup     : 0m11.646s
Before each run, disk caches were cleared:
$ free && sync && echo 3 >| /proc/sys/vm/drop_caches && free
After cloning the repository, build and run:
cd dedup
make
./dedup <dir1> <dir2> ...
Note that the program depends on OpenSSL (libcrypto) for hashing.