LyonSyonII/hunt-rs

Add a dedup feature

amatisara opened this issue · 0 comments

Greetings @LyonSyonII Liam, sending peace.

I write because I have been using your hunt-rs project/tool to find files on a Linux file system. Thank you. It worked so well and so fast, it inspired me to try and learn Rust.

I am writing to request two features for the tool around duplicate files. They would help round out the feature set.

  1. The tool should return a list of unique files, i.e. the code computes a hash/digest of each file's contents and inserts it into a HashMap along with the path. Any later file whose digest is already in the map would be skipped.

  2. The tool should also keep track of the duplicated files it finds and display them.

This could be exposed through an option such as --duplicates or --unique, as an example.
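To make the request concrete, here is a rough sketch of the idea in Rust using only the standard library. It is not based on hunt's actual code: `DedupReport`, `digest`, and `dedup` are hypothetical names I made up for illustration, and `DefaultHasher` is only a stand-in for a real content digest (something like SHA-256 or BLAKE3 from a crate would be a better fit, since a 64-bit hash can collide in principle).

```rust
use std::collections::hash_map::{DefaultHasher, Entry};
use std::collections::HashMap;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical result type: first-seen files plus later copies of them.
#[derive(Debug, Default)]
pub struct DedupReport {
    /// First path seen for each distinct file content.
    pub unique: Vec<PathBuf>,
    /// (duplicate path, path of the first file with the same content).
    pub duplicates: Vec<(PathBuf, PathBuf)>,
}

/// Hash a file's bytes. Assumption: DefaultHasher stands in for a
/// proper cryptographic digest, and the whole file is read into memory.
fn digest(path: &Path) -> io::Result<u64> {
    let bytes = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    Ok(hasher.finish())
}

/// Split `paths` into unique files and duplicates, in input order.
pub fn dedup(paths: &[PathBuf]) -> io::Result<DedupReport> {
    // Maps a content digest to the first path that produced it.
    let mut seen: HashMap<u64, PathBuf> = HashMap::new();
    let mut report = DedupReport::default();

    for path in paths {
        match seen.entry(digest(path)?) {
            // New content: remember it and report the file as unique.
            Entry::Vacant(slot) => {
                slot.insert(path.clone());
                report.unique.push(path.clone());
            }
            // Content already seen: record the path as a duplicate.
            Entry::Occupied(first) => {
                report.duplicates.push((path.clone(), first.get().clone()));
            }
        }
    }
    Ok(report)
}
```

With something like this, a --unique option could print `report.unique` and a --duplicates option could print `report.duplicates`. Comparing file sizes before hashing would probably avoid reading most files at all, but I left that out to keep the sketch short.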

The use cases are:

  1. Virtual environments and containerization (Python, JavaScript, Docker) tend to copy or download multiple copies of the same files, which are sometimes left abandoned.

  2. Installing several similar pieces of software tends to duplicate their config files. For example, Nginx and Apache may have duplicate configs, certs, keys, and security settings.

  3. Admin simply needs to know.

Of course, duplicated files take up unnecessary space. But more importantly, if these files are left for too long without being updated, they could potentially be a security risk.

One Love.