A toolkit of MD5-based filesystem scripts for managing and analyzing file integrity across nested directories. Features include recursive checksum file creation, duplicate file detection by checksum comparison, and automated duplicate removal based on those checksums.
generate_checksums_recursively.sh recursively finds all directories within a specified root directory, counts the files in each directory (excluding subdirectories and any 'checksums.md5' files), and generates a 'checksums.md5' file in each directory containing the MD5 checksum of every file in it. It also reports its progress and the results of its operations.
This command will process the 'Photos' directory, showing the progress and results for each subdirectory found:
./generate_checksums_recursively.sh ~/Photos
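The per-directory pass described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of generate_checksums_recursively.sh: the function name `generate_checksums` is hypothetical, entries are assumed to be stored relative to each directory, and `-maxdepth` assumes GNU or BSD find.

```shell
# generate_checksums ROOT (hypothetical name): write a checksums.md5 into
# every directory under ROOT that holds at least one regular file.
generate_checksums() {
    # Run a small script once per directory; "$1" inside it is that directory.
    find "$1" -type d -exec sh -c '
        cd "$1" || exit 1
        # Count regular files directly in this directory, skipping checksums.md5.
        count=$(find . -maxdepth 1 -type f ! -name checksums.md5 | wc -l | tr -d " ")
        echo "Processing $1: $count file(s)"
        [ "$count" -gt 0 ] || exit 0
        # Hash the same set of files; paths are recorded relative to the directory.
        find . -maxdepth 1 -type f ! -name checksums.md5 -exec md5sum {} + > checksums.md5
    ' sh {} \;
}
```

Calling `generate_checksums ~/Photos` with this sketch would leave one checksums.md5 in each directory that contains at least one regular file, and none in directories that hold only subdirectories.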
find_duplicates.sh identifies and lists duplicate files based on their MD5 checksums. It requires a file named checksums.md5 (generated by generate_checksums_recursively.sh or directly by md5sum) that lists MD5 checksums and their corresponding files. The script reads this checksum file and groups the files by checksum; any group with more than one file is a set of duplicates.
To check for duplicates using a file checksums.md5 in the 'Photos' directory:
md5sum ~/Photos/* > ~/Photos/checksums.md5
./find_duplicates.sh ~/Photos
Example output:
26ab0db90d72e28ad0ba1e22ee510510:
/home/user/Photos/Toronto2024.jpg
/home/user/Photos/Toronto2024 (1).jpg
b026324c6904b2a9cb4b88d6d61c81d1:
/home/user/Photos/Paris2023.jpg
/home/user/Photos/Paris2023 (1).jpg
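One way to group entries by hash, shown as a sketch rather than the actual logic of find_duplicates.sh, is to sort the checksum file so that equal hashes become adjacent and then print each run of two or more paths. The function name `list_duplicates` is hypothetical, and the sketch assumes md5sum's usual line layout (32 hex characters, a two-character separator, then the path).

```shell
# list_duplicates FILE (hypothetical name): print every group of paths in
# FILE that share an MD5 hash, in the "hash:" + paths layout.
list_duplicates() {
    # Sorting in the C locale puts identical hashes on consecutive lines.
    LC_ALL=C sort "$1" | awk '
        {
            hash = substr($0, 1, 32)   # the MD5 digest
            path = substr($0, 35)      # everything after the separator
            if (hash == prev) {
                # Second or later file with this hash: print the header once.
                if (!printed) { print prev ":"; print first; printed = 1 }
                print path
            } else {
                # New hash: remember its first path in case a duplicate follows.
                prev = hash; first = path; printed = 0
            }
        }'
}
```

Files whose hash appears only once produce no output, so an empty result means no duplicates were found.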
remove_duplicates.sh identifies and removes duplicate files within a specified directory, keeping only the file with the shortest name in each set of duplicates. To identify duplicates it uses a checksums.md5 file (generated by generate_checksums_recursively.sh or directly by md5sum), which should contain the MD5 hashes of the files in the directory.
To remove duplicates using a file checksums.md5 in the 'Photos' directory:
md5sum ~/Photos/* > ~/Photos/checksums.md5
./remove_duplicates.sh ~/Photos
Example output:
/home/user/Photos/Toronto2024.jpg
/home/user/Photos/Toronto2024 (1).jpg
Removing: /home/user/Photos/Toronto2024 (1).jpg
/home/user/Photos/Paris2023.jpg
/home/user/Photos/Paris2023 (1).jpg
Removing: /home/user/Photos/Paris2023 (1).jpg
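The keep-the-shortest rule used by remove_duplicates.sh can be sketched as below. This is an illustration under stated assumptions, not the script itself: `remove_duplicates` and its `--delete` flag are hypothetical, and the sketch only prints what it would remove unless that flag is given. It compares full path lengths, which is equivalent to comparing name lengths when all files sit in one directory.

```shell
# remove_duplicates FILE [--delete] (hypothetical name and flag):
# for each hash in FILE keep the shortest path and report the others.
remove_duplicates() {
    LC_ALL=C sort "$1" | awk '
        {
            hash = substr($0, 1, 32)
            path = substr($0, 35)
            # First path seen for a hash becomes the provisional keeper.
            if (hash != prev) { prev = hash; keep = path; next }
            # Same hash again: the longer of the two paths is the one to remove.
            if (length(path) < length(keep)) { victim = keep; keep = path }
            else victim = path
            print victim
        }' | while IFS= read -r victim; do
            echo "Removing: $victim"
            # Dry run by default; delete only when explicitly asked to.
            if [ "${2:-}" = "--delete" ]; then
                rm -- "$victim"
            fi
        done
}
```

Running the dry run first and reviewing the "Removing:" lines before repeating the call with `--delete` keeps an accidental deletion from being the first sign of a stale or mismatched checksum file.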
find_duplicates_recursively.sh scans the specified directory and its subdirectories for checksums.md5 files (generated by generate_checksums_recursively.sh), which contain pre-computed hashes of the files. It then groups files with identical hashes across all directories and lists each group, so that users can review the duplicates manually and decide how to handle them.
To find duplicates in the 'Photos' directory:
./generate_checksums_recursively.sh ~/Photos
./find_duplicates_recursively.sh ~/Photos
Example output:
Processing /home/pavel/Photos/Berlin/checksums.md5
Processing /home/pavel/Photos/Boston/checksums.md5
Processing /home/pavel/Photos/All/checksums.md5
c30f7472766d25af1dc80b3ffc9a58c7
/home/pavel/Photos/Berlin/1.jpg
/home/pavel/Photos/Berlin/11.jpg
/home/pavel/Photos/All/Berlin.jpg
26ab0db90d72e28ad0ba1e22ee510510
/home/pavel/Photos/Boston/2.jpg
/home/pavel/Photos/Boston/22.jpg
/home/pavel/Photos/All/Boston.jpg
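Comparing files across directories first requires merging the per-directory checksums.md5 files into one list with unambiguous paths. The following is a rough sketch of that collection step only, not the actual code of find_duplicates_recursively.sh. The name `collect_checksums` is hypothetical, and the sketch assumes entries are either relative to the directory holding the checksum file or already absolute.

```shell
# collect_checksums ROOT (hypothetical name): print "hash  path" for every
# entry of every checksums.md5 under ROOT, with paths rooted at ROOT.
collect_checksums() {
    find "$1" -type f -name checksums.md5 | while IFS= read -r list; do
        # Progress goes to stderr so stdout stays a clean hash list.
        echo "Processing $list" >&2
        # DIR is passed through the environment to avoid quoting problems.
        DIR=$(dirname "$list") awk '
            length($0) > 34 {
                hash = substr($0, 1, 32)
                path = substr($0, 35)
                sub(/^\.\//, "", path)          # drop a leading "./"
                if (path !~ /^\//)              # leave absolute paths untouched
                    path = ENVIRON["DIR"] "/" path
                print hash "  " path
            }' "$list"
    done
}
```

The output has the same `hash  path` layout as a single checksums.md5, so any grouping step that reads that layout can consume it unchanged. Directory names containing newlines are not handled by this sketch.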
This script automates the process of identifying and removing duplicate files within a specified directory and its subdirectories, using existing checksums.md5 files (generated by generate_checksums_recursively.sh). It identifies duplicates by their hashes, retains the file with the shortest path in each set of duplicates, and removes the others.
To remove duplicates in the 'Photos' directory:
./generate_checksums_recursively.sh ~/Photos
./remove_duplicates_recursively.sh ~/Photos
Example output:
Processing /home/pavel/Photos/Berlin/checksums.md5
Processing /home/pavel/Photos/Boston/checksums.md5
Processing /home/pavel/Photos/All/checksums.md5
Removing: /home/pavel/Photos/Berlin/11.jpg
Removing: /home/pavel/Photos/All/Berlin.jpg
Removing: /home/pavel/Photos/Boston/22.jpg
Removing: /home/pavel/Photos/All/Boston.jpg
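Because removal relies on checksums.md5 files computed earlier, a file may have changed since it was hashed. A cautious variant, which the source does not say the script implements, re-hashes each candidate just before deleting it and skips anything that no longer matches. `still_matches` is a hypothetical helper name used only for this sketch.

```shell
# still_matches HASH PATH (hypothetical helper): succeed only if PATH exists
# and its current MD5 digest equals the recorded HASH.
still_matches() {
    [ -f "$2" ] || return 1
    # Reading from stdin makes md5sum print "digest  -", so the first
    # space-separated field is the digest regardless of the file name.
    [ "$(md5sum < "$2" | cut -d' ' -f1)" = "$1" ]
}
```

A removal loop could then guard each deletion with `still_matches "$hash" "$path" && rm -- "$path"`, leaving modified or already-missing files untouched.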