
Find duplicate files according to md5 hash

Primary language: Go. License: GNU Lesser General Public License v3.0 (LGPL-3.0).



dupfinder reads absolute file paths from stdin, one per line, and calculates the md5 checksum of each file. Once EOF is reached, it prints a summary to stdout. For each checksum, the summary contains a line with the md5 checksum, followed by the names of the files that have that checksum.


dupfinder accepts the following options:

-sumsOnly
    	just output the checksums of all files (default 'false')
-w int
    	count of parallel md5sum workers (default 4)



Browse all files and subdirectories within /var/tmp and search for duplicate files. Save the duplicates in a text file, using 10 md5sum workers:

$ find /var/tmp/ -printf "%p\n" | dupfinder -w 10 > duplicates.txt

Browse all files and subdirectories within /var/tmp and print the checksums of all files. This works like the Unix tool 'md5sum', but with 10 parallel workers:

$ find /var/tmp/ -printf "%p\n" | dupfinder -w 10 -sumsOnly


Browse all files and subdirectories within C:\tmp and search for duplicate files. Save the duplicates in a text file:

> dir C:\tmp /S /B | dupfinder.exe > duplicates.txt


$ find /var/tmp/ -printf "%p\n" | dupfinder
2015/11/22 20:07:38 Worker count 4
Checksum d41d8cd98f00b204e9800998ecf8427e:

$ find /var/tmp/ -printf "%p\n" | dupfinder -sumsOnly
2016/01/27 18:24:04 Worker count 4
8eea72e38a8c03d1932cb505a22c69c7  /var/tmp/file4
5c219e4eef807cb8485e4795fa2ecd1b  /var/tmp/file1
518dea85c42eb48d0db9a5486f9351cc  /var/tmp/file2
38de3a8ad093febed8d7a2e63cdaae37  /var/tmp/file3