
The program detects files or directories that are duplicates of each other

Primary language: Python. License: MIT.

identical-file-and-directory-detector

The program recursively traverses the given directories and looks for files or directories that are duplicates of each other (i.e. identical). The full pathnames of the duplicates are printed as output.

The program can be invoked with the following options and arguments:
identic [-f | -d] [-c] [-n] [-s] [<dir1> <dir2> ..]


[-f | -d] -f looks for identical files, -d looks for identical directories. The default is identical files.
-c Identical means the contents are exactly the same (the names can differ).
-n Identical means the directory/file names are exactly the same (the contents can differ).
-cn Identical means both the contents and the directory/file names are exactly the same.
[<dir1> <dir2> ..] The list of directories to traverse. Each directory is traversed recursively, i.e. including all of its subdirectories at every depth. The default is the current directory.
-s The size of each duplicate is also printed, and the duplicates are printed in descending order of size. This option is ignored when the -n option is used.
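
As an illustration of the option list above, here is a minimal sketch of how such a command line could be parsed with Python's argparse module. It is not taken from the program's source: the function name build_parser and the attribute names (files, directories, contents, names, size, dirs) are hypothetical. The README does not say what happens when neither -c nor -n is given, so the sketch simply leaves both flags off in that case.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="identic")

    # -f and -d exclude each other; when neither is given, files are assumed.
    kind = parser.add_mutually_exclusive_group()
    kind.add_argument("-f", dest="files", action="store_true",
                      help="look for identical files (default)")
    kind.add_argument("-d", dest="directories", action="store_true",
                      help="look for identical directories")

    # argparse accepts combined short flags, so "-cn" sets both -c and -n.
    parser.add_argument("-c", dest="contents", action="store_true",
                        help="identical means same contents")
    parser.add_argument("-n", dest="names", action="store_true",
                        help="identical means same names")
    parser.add_argument("-s", dest="size", action="store_true",
                        help="also print sizes, largest first")

    # Zero or more directories; fall back to the current directory.
    parser.add_argument("dirs", nargs="*", default=["."],
                        help="directories to traverse recursively")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this sketch, a command line such as "identic -d -cn dir1 dir2" yields directories, contents and names all set, and dirs equal to the two given directories.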

It is assumed that the directory hierarchy forms a tree and that there are no symbolic links.
• The program uses SHA-256 hashes to locate identical items.
• To locate identical directories, hash trees are used.
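
The two points above can be sketched roughly as follows for the content-based (-c) case. This is an assumption-laden illustration, not the program's actual code: the function names file_hash, dir_hash and identical_directories are hypothetical, and so is the exact way child hashes are combined (sorted, concatenated, and prefixed with a "D" marker so that an empty directory does not hash like an empty file). Names are deliberately left out of the hash, since -c compares contents only.

```python
import hashlib
import os
from collections import defaultdict


def file_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dir_hash(path, table):
    """Hash a directory from its children's hashes (a hash tree).

    Every directory visited, including `path` itself, is recorded in `table`
    under its digest so that equal digests can be grouped afterwards.
    """
    child_hashes = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            child_hashes.append(dir_hash(entry.path, table))
        elif entry.is_file(follow_symlinks=False):
            child_hashes.append(file_hash(entry.path))
    # Sorting makes the digest independent of listing order and of names.
    combined = "D" + "".join(sorted(child_hashes))
    digest = hashlib.sha256(combined.encode("ascii")).hexdigest()
    table[digest].append(path)
    return digest


def identical_directories(root):
    """Return groups of directories under `root` whose contents hash equally."""
    table = defaultdict(list)
    dir_hash(root, table)
    return [sorted(paths) for paths in table.values() if len(paths) > 1]
```

Because a directory's digest is derived only from the digests of its children, each file is read once and each directory is hashed once, no matter how many other directories it is later compared against.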

Developed for CMPE 230 SYSTEMS PROGRAMMING course, Bogazici University Computer Engineering, Spring 2020.