This tool will help you organize any collection of files that is standardized in some way, allowing you to batch clean up filenames or even completely rebuild them, making everything tidier and easier to find.
Yes, the name comes from "refine your [photos | images | videos | movies | porn | music | etc.] collection"!
It will help you manage your file collections like no other tool! It can scan several given paths at the same time and analyze all files, performing some advanced operations on them such as finding possibly duplicated files, batch renaming them, stripping prefixes or suffixes, or even automatically grouping and rebuilding their filenames according to your rules!
It is blazingly fast and tiny, made 100% in Rust 🦀!
Install refine with:
cargo install refine
That's it, and you can then just call it anywhere!
- nicer rename command output by parent directory
- new threaded yes/no prompt that can be aborted with CTRL-C
- rename: disallow by default changes in directories where clashes are detected
- new --clashes option to allow them
- rebuild: new replace feature, finally!
- rebuild, rename: make strip options also remove "." and "_", in addition to "-" and spaces
- global: include and exclude options do not check extensions
- dupes: remove case option, so everything is case-insensitive now
- global: new --dir-in and --dir-out options.
- new rename command
- rebuild, rename: improve strip exact, not removing more spaces than needed
- global: new --exclude option to exclude files
- new support for Ctrl-C, to abort all operations and gracefully exit the program at any time.
- all commands will stop collecting files when Ctrl-C is pressed
- both the dupes and list commands will show partial results
- the rebuild command will just exit, as it needs all the files to run
- new "list" command
- global: new --include option to filter input files
- rebuild: new --force option to easily rename new files
- rebuild: new interactive mode by default, making --dry_run obsolete (removed), with new --yes option to bypass it (good for automation)
- rebuild: auto fix renaming errors
- dupes: faster performance by ignoring groups with 1 file (thus avoiding loading samples)
- rebuild: smaller memory consumption by caching file extensions
All commands will:
- recursively scan all the given paths (excluding hidden .folders)
- optionally perform only a shallow scan (--shallow)
- optionally filter files based on two regexes (--include and --exclude)
- optionally filter directories based on two regexes (--dir-in and --dir-ex)
- load the metadata the command requires to run (e.g. file size, creation date, etc.) for each file
- execute the command and print the results
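The scanning step above can be pictured with a minimal sketch. This is not refine's actual code: the `scan` function and its `keep` predicate are hypothetical names, and `keep` only stands in for the --include and --exclude regex checks, since the standard library has no regex support.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper, not refine's real code: collect the files under
/// `root`, never entering hidden directories (names starting with '.').
/// With `shallow` set, subdirectories are not entered at all.
/// `keep` stands in for the --include/--exclude checks.
fn scan(root: &Path, shallow: bool, keep: impl Fn(&Path) -> bool) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                let hidden = entry.file_name().to_string_lossy().starts_with('.');
                if !shallow && !hidden {
                    pending.push(path);
                }
            } else if file_type.is_file() && keep(&path) {
                files.push(path);
            }
        }
    }
    files.sort(); // deterministic order for the caller
    Ok(files)
}
```

An explicit stack of pending directories is used instead of recursion, which keeps the shallow mode a one-line check.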
refine --help
Refine your file collection using Rust!
Usage: refine [OPTIONS] [PATHS]... <COMMAND>
Commands:
  dupes    Find possibly duplicated files by both size and filename
  rebuild  Rebuild the filenames of media collections intelligently
  list     List files from the given paths
  rename   Rename files in batch, according to the given rules
  help     Print this message or the help of the given subcommand(s)
Options:
  -h, --help     Print help
  -V, --version  Print version
Global:
  -i, --include <REGEX>  Include only these files; checked against filename without extension, case-insensitive
  -x, --exclude <REGEX>  Exclude these files; checked against filename without extension, case-insensitive
      --dir-in <REGEX>   Include only these subdirectories; case-insensitive
      --dir-ex <REGEX>   Exclude these subdirectories; case-insensitive
      --shallow          Do not recurse into subdirectories
  [PATHS]...             Paths to scan
For more information, see https://github.com/rsalmei/refine
The dupes command will analyze and report the possibly duplicated files, by both size and name. It will also load a sample from each file, to check that they are indeed likely duplicates. It is a small sample by default (2048 bytes), but it helps reduce false positives a lot, and you can increase it if you want.
1. group all the files by size
2. for each group with the exact same value, load a sample of its files
3. compare the samples with each other and find possible duplicates
4. group all the files by words in their names
   - the word extractor ignores sequence numbers like file-1, file copy, file-3 copy 2, etc.
5. run 2. and 3. again, and print the results
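As a rough illustration of these steps, here is a sketch that works on in-memory data only. `Entry`, `possible_dupes` and `name_key` are hypothetical names, the samples are assumed to be already loaded, and the word extractor is a simplified guess at the behavior described above, not refine's real one.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical in-memory stand-in for a scanned file.
#[derive(Debug, Clone)]
struct Entry {
    name: String,
    size: u64,
    sample: Vec<u8>, // a few bytes assumed to be already read from the file
}

/// Group entries by `key`, ignore groups with a single entry, then split each
/// group by sample so only entries with identical samples stay together.
fn possible_dupes<K, F>(entries: &[Entry], key: F) -> Vec<Vec<String>>
where
    K: Hash + Eq,
    F: Fn(&Entry) -> K,
{
    let mut by_key: HashMap<K, Vec<&Entry>> = HashMap::new();
    for e in entries {
        by_key.entry(key(e)).or_default().push(e);
    }
    let mut result = Vec::new();
    for group in by_key.into_values().filter(|g| g.len() > 1) {
        let mut by_sample: HashMap<&[u8], Vec<String>> = HashMap::new();
        for e in group {
            by_sample.entry(e.sample.as_slice()).or_default().push(e.name.clone());
        }
        for mut names in by_sample.into_values().filter(|n| n.len() > 1) {
            names.sort();
            result.push(names);
        }
    }
    result.sort(); // HashMap order is arbitrary, so sort for stable output
    result
}

/// Simplified word extractor: drops the extension, pure numbers and the word
/// "copy", so "file-1", "file copy" and "file-3 copy 2" all map to "file".
fn name_key(name: &str) -> String {
    let stem = name.rsplit_once('.').map_or(name, |(stem, _)| stem);
    stem.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .filter(|w| !w.chars().all(|c| c.is_ascii_digit()))
        .map(str::to_lowercase)
        .filter(|w| w.as_str() != "copy")
        .collect::<Vec<_>>()
        .join(" ")
}
```

Calling `possible_dupes(&entries, |e| e.size)` and then `possible_dupes(&entries, |e| name_key(&e.name))` mirrors the two passes: the same sample comparison runs once per grouping key.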
refine dupes --help
Find possibly duplicated files by both size and filename
Usage: refine dupes [OPTIONS] [PATHS]...
Options:
  -s, --sample <BYTES>  Sample size in bytes (0 to disable) [default: 2048]
  -h, --help            Print help
Global:
  -i, --include <REGEX>  Include only these files; checked against filename without extension, case-insensitive
  -x, --exclude <REGEX>  Exclude these files; checked against filename without extension, case-insensitive
      --dir-in <REGEX>   Include only these subdirectories; case-insensitive
      --dir-ex <REGEX>   Exclude these subdirectories; case-insensitive
      --shallow          Do not recurse into subdirectories
  [PATHS]...             Paths to scan
Example:
❯ refine dupes ~/Downloads /Volumes/External --sample 20480
The rebuild command is a great achievement, if I say so myself. It will smartly rebuild the filenames of an entire collection when it is composed of user ids or streamer names, for instance. It will do so by removing sequence numbers, stripping parts of filenames you don't want, smartly detecting misspelled names by comparing with adjacent files, sorting the detected groups deterministically by creation date, regenerating the sequence numbers, and finally renaming all the files accordingly. It's awesome to quickly find your video or music library neatly sorted automatically... And the next time you run it, it will detect the new files added since the last time, and include them in the correct group! Pretty cool, huh? And don't worry, you can review all the changes before applying them.
- remove any sequence numbers like file-1, file copy, file-3 copy 2, etc.
- strip parts of the filenames, either before, after, or exactly a certain string
- remove spaces and underscores, and smartly detect misspelled names
- group the resulting names, and smartly choose the most likely correct name among the group
- sort the group according to the file creation date
- regenerate the sequence numbers for the group (note that this occurs on the whole group, regardless of the directory each file currently resides in)
- print the resulting changes to the filenames, and ask for confirmation
- if the user confirms, apply the changes to the filenames
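To make the steps above more concrete, here is a simplified sketch on in-memory data. It is not refine's implementation: `Media`, `strip_sequence`, `group_key` and `rebuild` are hypothetical names, the strip options are left out, the "most likely correct name" is assumed to be the most frequent spelling, and the regenerated names are assumed to follow a name-N.ext pattern.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a scanned file.
#[derive(Debug, Clone)]
struct Media {
    name: String, // filename with extension
    created: u64, // creation time, e.g. seconds since the epoch
}

fn split_ext(name: &str) -> (&str, &str) {
    match name.rsplit_once('.') {
        Some((stem, ext)) if !stem.is_empty() => (stem, ext),
        _ => (name, ""),
    }
}

/// Remove trailing sequence markers such as "-1", " copy" or " copy 2".
fn strip_sequence(stem: &str) -> &str {
    let mut s = stem.trim_end();
    loop {
        let before = s;
        // drop a trailing number together with its '-' or ' ' separator
        let head = s.trim_end_matches(|c: char| c.is_ascii_digit());
        if head.len() < s.len() && (head.ends_with('-') || head.ends_with(' ')) {
            s = head[..head.len() - 1].trim_end();
        }
        // drop a trailing " copy" marker
        if let Some(rest) = s.strip_suffix(" copy") {
            s = rest.trim_end();
        }
        if s == before {
            return s;
        }
    }
}

/// Comparison key: lowercase, without spaces and underscores.
fn group_key(stem: &str) -> String {
    stem.chars()
        .filter(|c| !matches!(c, ' ' | '_'))
        .flat_map(char::to_lowercase)
        .collect()
}

/// Returns the (old name, new name) pairs that would change.
fn rebuild(files: &[Media]) -> Vec<(String, String)> {
    let mut groups: BTreeMap<String, Vec<(&Media, &str, &str)>> = BTreeMap::new();
    for f in files {
        let (stem, ext) = split_ext(&f.name);
        let clean = strip_sequence(stem);
        groups.entry(group_key(clean)).or_default().push((f, clean, ext));
    }
    let mut changes = Vec::new();
    for mut group in groups.into_values() {
        // assumption: the most frequent spelling wins, ties go to the smallest
        let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
        for (_, clean, _) in &group {
            *counts.entry(*clean).or_default() += 1;
        }
        let best = counts
            .iter()
            .max_by(|a, b| a.1.cmp(b.1).then(b.0.cmp(a.0)))
            .map(|(name, _)| *name)
            .unwrap_or("");
        // sort by creation date, using the name as a deterministic tiebreaker
        group.sort_by(|a, b| (a.0.created, &a.0.name).cmp(&(b.0.created, &b.0.name)));
        for (seq, (file, _, ext)) in group.iter().enumerate() {
            let mut new_name = format!("{}-{}", best, seq + 1);
            if !ext.is_empty() {
                new_name.push('.');
                new_name.push_str(ext);
            }
            if new_name != file.name {
                changes.push((file.name.clone(), new_name));
            }
        }
    }
    changes
}
```

The sketch only computes the list of changes; printing them, asking for confirmation and renaming on disk would happen afterwards.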
refine rebuild --help
Rebuild the filenames of media collections intelligently
Usage: refine rebuild [OPTIONS] [PATHS]...
Options:
  -b, --strip-before <STR|REGEX>  Remove from the start of the filename to this str; blanks are automatically removed
  -a, --strip-after <STR|REGEX>   Remove from this str to the end of the filename; blanks are automatically removed
  -e, --strip-exact <STR|REGEX>   Remove all occurrences of this str in the filename; blanks are automatically removed
  -s, --no-smart-detect           Detect and fix similar filenames (e.g. "foo bar.mp4" and "foo__bar.mp4")
  -f, --force <STR>               Easily set filenames for new files. BEWARE: use only on already organized collections
  -y, --yes                       Skip the confirmation prompt, useful for automation
  -h, --help                      Print help
Global:
  -i, --include <REGEX>  Include only these files; checked against filename without extension, case-insensitive
  -x, --exclude <REGEX>  Exclude these files; checked against filename without extension, case-insensitive
      --dir-in <REGEX>   Include only these subdirectories; case-insensitive
      --dir-ex <REGEX>   Exclude these subdirectories; case-insensitive
      --shallow          Do not recurse into subdirectories
  [PATHS]...             Paths to scan
Example:
❯ refine rebuild ~/media /Volumes/External -a 720p -a Bluray -b xpto -e old
The list command will gather all the files in the given paths, sort them by name, size, or path, and display them in a friendly format.
- sort all files by either name, size, or path
- ascending by default, or optionally descending
- print the results
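The sorting step can be sketched in a few lines. The `By` enum mirrors the --by values shown in the help, but `Item`, `file_name` and `sort_items` are hypothetical names used only for this illustration, and paths are assumed to use "/" as separator.

```rust
use std::cmp::Ordering;

/// Mirrors the possible values of --by.
#[derive(Debug, Clone, Copy)]
enum By {
    Name,
    Size,
    Path,
}

/// Hypothetical in-memory stand-in for a scanned file.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    path: String,
    size: u64,
}

/// Last component of a "/"-separated path.
fn file_name(path: &str) -> &str {
    path.rsplit('/').next().unwrap_or(path)
}

/// Sort ascending by the chosen field, or descending when `desc` is set.
fn sort_items(items: &mut [Item], by: By, desc: bool) {
    items.sort_by(|a, b| {
        let ord: Ordering = match by {
            By::Name => file_name(&a.path).cmp(file_name(&b.path)),
            By::Size => a.size.cmp(&b.size),
            By::Path => a.path.cmp(&b.path),
        };
        if desc { ord.reverse() } else { ord }
    });
}
```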
refine list --help
List files from the given paths
Usage: refine list [OPTIONS] [PATHS]...
Options:
  -b, --by <BY>  Sort by [default: name] [possible values: name, size, path]
  -d, --desc     Use descending order
  -h, --help     Print help
Global:
  -i, --include <REGEX>  Include only these files; checked against filename without extension, case-insensitive
  -x, --exclude <REGEX>  Exclude these files; checked against filename without extension, case-insensitive
      --dir-in <REGEX>   Include only these subdirectories; case-insensitive
      --dir-ex <REGEX>   Exclude these subdirectories; case-insensitive
      --shallow          Do not recurse into subdirectories
  [PATHS]...             Paths to scan
Example:
❯ refine list ~/Downloads /Volumes/External --by size --desc
The rename command will let you batch rename files like no other tool, seriously! You can quickly strip common prefixes, suffixes, and exact parts of the filenames, as well as apply any regex replacements you want. By default, if a new filename ends up clashing with another file in the same directory, all changes in that whole directory are disallowed. The list of clashes will be nicely formatted and printed, so you can manually check them. And if you find it safe, you can optionally allow the changes to the other files in the same directory, removing only the clashes.
- strip parts of the filenames, either before, after, or exactly a certain string
- apply the regex replacement rules
- remove all changes from the whole directory where clashes are detected
- optionally remove only the clashes, allowing the other changes (--clashes)
- print the resulting changes to the filenames, and ask for confirmation
- if the user confirms, apply the changes to the filenames
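The clash handling described above could look roughly like the sketch below. It is an illustration under assumptions, not refine's code: `Planned` and `resolve` are hypothetical names, and a clash is assumed to mean that two files in the same directory would end up with the same name, whether both are renamed or one of them keeps its current name.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical planned rename; `new == old` means the file is left untouched.
#[derive(Debug, Clone, PartialEq)]
struct Planned {
    dir: String,
    old: String,
    new: String,
}

/// Returns (allowed changes, clashing changes).
/// By default every change in a directory with a clash is dropped; with
/// `allow_clashes` (the --clashes option) only the clashing ones are dropped.
fn resolve(files: &[Planned], allow_clashes: bool) -> (Vec<Planned>, Vec<Planned>) {
    // count how many files would end up with each (directory, new name)
    let mut targets: HashMap<(&str, &str), usize> = HashMap::new();
    for f in files {
        *targets.entry((f.dir.as_str(), f.new.as_str())).or_default() += 1;
    }
    // a change clashes when its target name is wanted by more than one file
    let clash: Vec<bool> = files
        .iter()
        .map(|f| f.new != f.old && targets[&(f.dir.as_str(), f.new.as_str())] > 1)
        .collect();
    let blocked: HashSet<&str> = files
        .iter()
        .zip(&clash)
        .filter(|(_, c)| **c)
        .map(|(f, _)| f.dir.as_str())
        .collect();
    let mut allowed = Vec::new();
    let mut clashes = Vec::new();
    for (f, is_clash) in files.iter().zip(clash.iter().copied()) {
        if f.new == f.old {
            continue; // nothing to do for this file
        }
        if is_clash {
            clashes.push(f.clone());
        } else if allow_clashes || !blocked.contains(f.dir.as_str()) {
            allowed.push(f.clone());
        }
    }
    (allowed, clashes)
}
```

Unchanged files are part of the input on purpose, so that a rename onto an existing, untouched filename is also reported as a clash.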
refine rename --help
Rename files in batch, according to the given rules
Usage: refine rename [OPTIONS] [PATHS]...
Options:
  -b, --strip-before <STR|REGEX>   Remove from the start of the filename to this str; blanks are automatically removed
  -a, --strip-after <STR|REGEX>    Remove from this str to the end of the filename; blanks are automatically removed
  -e, --strip-exact <STR|REGEX>    Remove all occurrences of this str in the filename; blanks are automatically removed
  -r, --replace <{STR|REGEX}=STR>  Replace all occurrences of one str by another; applied in order and after the strip rules
  -c, --clashes                    Allow changes in directories where clashes are detected
  -y, --yes                        Skip the confirmation prompt, useful for automation
  -h, --help                       Print help
Global:
  -i, --include <REGEX>  Include only these files; checked against filename without extension, case-insensitive
  -x, --exclude <REGEX>  Exclude these files; checked against filename without extension, case-insensitive
      --dir-in <REGEX>   Include only these subdirectories; case-insensitive
      --dir-ex <REGEX>   Exclude these subdirectories; case-insensitive
      --shallow          Do not recurse into subdirectories
  [PATHS]...             Paths to scan
Example:
❯ refine rename ~/media /Volumes/External -b "^\d+_" -r '([^\.]*?)\.=$1 '
- 0.15.0 Jul 18, 2024: nicer rename command output by parent directory, new threaded yes/no prompt that can be aborted with CTRL-C
- 0.14.0 Jul 11, 2024: rename: disallow by default changes in directories where clashes are detected, including new --clashes option to allow them
- 0.13.0 Jul 10, 2024: rebuild: new replace feature, rebuild, rename: make strip options remove "." and "_", global: include and exclude options do not check extensions, dupes: remove case option
- 0.12.0 Jul 09, 2024: global: new --dir-in and --dir-out options
- 0.11.0 Jul 08, 2024: new rename command, rebuild, rename: improve strip exact
- 0.10.0 Jul 02, 2024: global: new --exclude
- 0.9.0 Jul 01, 2024: global: support for CTRL-C
- 0.8.0 Jun 30, 2024: new list command
- 0.7.1 Jun 28, 2024: global: --include is now case-insensitive, rebuild: fix smart detect bug not grouping some files, rebuild: strip rules remove hyphens too
- 0.7.0 Jun 27, 2024: global: new --include, rebuild: new --force, rebuild: new interactive mode, rebuild: new --yes, rebuild: auto fix rename errors, rebuild: smaller memory consumption, dupes: improved performance
- 0.6.0 Jun 24, 2024: new rebuild command, general polishing overall
- 0.5.0 Jun 20, 2024: support for shallow scan, verbose mode, dupes cmd ignores repetition systems
- 0.4.0 Jun 17, 2024: include dupes command, support match case and changing sample size
- 0.3.0 Nov 07, 2023: include dedup by both size and name
- 0.2.2 Jun 04, 2022: use 2KB sample size
- 0.2.1 Jun 04, 2022: improve error handling
- 0.2.0 Jun 01, 2022: publish, use split crate human-repr
- 0.1.1 May 27, 2022: samples the center of the files, which seems to fix false positives
- 0.1.0 May 25, 2022: first release, detects duplicated files, simple sampling strategy (1KB from the start of the files)
This software is licensed under the MIT License. See the LICENSE file in the top distribution directory for the full license text.
Maintaining an open source project is hard and time-consuming, and I've put much ❤️ and effort into this.
If you've appreciated my work, you can back me up with a donation! Thank you 😊