
Efficiently import pictures while handling duplicates gracefully

Primary language: Haskell. License: MIT.

Soph

This is a simple utility to import pictures while handling duplicates gracefully.

Note that this is still unstable. Collections created by the current version will almost certainly not work with future versions, due to differences in hash formatting (until I’ve added version migration support). This isn’t a big problem, however, because it’s possible to just reimport all pictures.

Usage

To import all pictures from directory imports to collection:

$ soph imports collection

If similar images have been found after processing, a feh window will open with all of them (the new picture and the similar ones in the collection). Use the arrow keys to scroll through them and delete the one(s) you don’t want by pressing <Enter>, then quit with q.

Importing means copying the file into collection under a hash-based filename. A simple directory listing of collection is then enough to recover the hashes of every imported picture, so in a way the filenames act as a database.
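To illustrate how filenames can act as a database, here is a minimal sketch that decodes names of the shape seen in the example run (hex digits, a dash, more hex digits, then an extension). The HashedName record, its field names and both functions are hypothetical and not taken from soph's source; the actual format may differ between versions.

```haskell
import Data.Char (isHexDigit)
import Data.Maybe (mapMaybe)

-- Hypothetical record; the field names are not taken from soph's source.
data HashedName = HashedName
  { firstHash  :: String  -- hex digits before the dash
  , secondHash :: String  -- hex digits after the dash
  , extension  :: String  -- file extension without the dot
  } deriving (Eq, Show)

-- Parse "<hex>-<hex>.<ext>"; anything else yields Nothing.
parseHashedName :: FilePath -> Maybe HashedName
parseHashedName name =
  case break (== '-') name of
    (h1, '-' : rest) ->
      case break (== '.') rest of
        (h2, '.' : ext)
          | valid h1 && valid h2 && not (null ext) ->
              Just (HashedName h1 h2 ext)
        _ -> Nothing
    _ -> Nothing
  where
    valid s = not (null s) && all isHexDigit s

-- Rebuild the in-memory "database" from a directory listing,
-- skipping files that do not follow the naming scheme.
decodeListing :: [FilePath] -> [HashedName]
decodeListing = mapMaybe parseHashedName
```

Feeding decodeListing the result of a directory listing of collection would give one record per imported picture, without opening any image file.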

Here is an example run with one new image, one image similar to a picture already in the collection, and one exact duplicate.

$ soph imports collection
[Info#init] Reading hashdir, decoding filenames and initializing database
[Info#process] Starting image processing of 3 files in import directory
[Info#process] New image at imports/002.jpg: 1 similar image(s) found
[Info#process] New image at imports/001.png: Already present as collection/b0ec08147fc1b495-0e02fe1b61760fa06703f87e8388780b01ff.png, removing the import file
[Info#process] New image at imports/003.png: New image, importing it
[Info#process] Importing new file imports/003.png into library to path collection/dea63899c760cc0b-f9f81f001c3980f81fc1fc07e0ce0ce0cecc.png
[Info#similars] Processing 1 similar images
[Info#process] New image at imports/002.jpg: 1 similar image(s) found
[Info#similar] File imports/002.jpg to import has 1 similar images:
[Info#similar]   collection/3cc098ca1efed3c5-18f307b231870071ff0f20f17f01fb01f00f.png
[Info#similar]   Opening them and the one to import in feh, delete the ones you don't want with <Enter>, then quit feh with <q>
[Info#process] Importing new file imports/002.jpg into library to path collection/6d384ae4863fc970-18f307b233870071df0f20f17f01fb01f00f.jpg
[Info] Finished import of 2 images

Note: The output is a bit misleading, since it reports the same picture twice. Soph first does a pass over all images without asking the user anything, skipping every picture that has similar ones in the collection. Once that is done, it does a second pass over the skipped pictures, this time asking the user and carrying out the appropriate action. This is really desirable on large imports, since all prompts come at the end. Output will be made cleaner in future versions.
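The two-pass idea described in the note can be sketched as follows. This is only an illustration under assumptions: the Outcome type, firstPass and secondPass are hypothetical names, and the inspect function stands in for whatever soph actually does to a single file.

```haskell
import Data.Either (partitionEithers)

-- Hypothetical outcome of looking at one import file.
data Outcome
  = Handled String      -- dealt with immediately (imported or removed)
  | NeedsUser FilePath  -- has similar images, so the user must decide

-- First pass: never blocks. Files with similar images are only collected,
-- everything else is handled right away. Order is preserved in both lists.
firstPass :: (FilePath -> Outcome) -> [FilePath] -> ([String], [FilePath])
firstPass inspect files = partitionEithers (map (toEither . inspect) files)
  where
    toEither (Handled msg)    = Left msg
    toEither (NeedsUser path) = Right path

-- Second pass: run the interactive action for each deferred file, in order.
secondPass :: Monad m => (FilePath -> m a) -> [FilePath] -> m [a]
secondPass = mapM
```

With this split, a large import runs unattended until every file has been looked at once, and the interactive part only starts afterwards.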

How it works

The above command will do the following:

  1. Do a file listing of collection to know previously imported images
  2. Process every picture in imports (recursively)
  3. Import every picture in imports into collection
    • If the new picture is already present (same content hash), delete it
    • If the new picture is not yet present, import it
    • If the new picture has similar images present, open all of them in feh, allowing you to delete the ones you don’t want. If after that the new one wasn’t deleted, import it.
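The three cases in step 3 can be sketched as a small decision function. Several things here are assumptions rather than soph's actual implementation: that the part of the filename before the dash is the content hash and the part after it is a similarity hash, that similarity is measured as a Hamming distance between those hashes, and that a fixed threshold is used. All names are hypothetical.

```haskell
import Data.Bits (popCount, xor)
import Data.Char (digitToInt)

-- Hypothetical entry: (content hash, similarity hash), both as hex strings.
type Entry = (String, String)

data Decision
  = AlreadyPresent   -- same content hash: delete the import file
  | Similar [Entry]  -- ask the user, showing these entries
  | Fresh            -- nothing close: import right away
  deriving (Eq, Show)

-- Number of differing bits between two equal-length strings of hex digits.
hammingHex :: String -> String -> Int
hammingHex a b = sum (zipWith bits a b)
  where
    bits x y = popCount (digitToInt x `xor` digitToInt y)

-- Classify a new picture against the collection.
-- The threshold is an assumed tuning parameter.
decide :: Int -> [Entry] -> Entry -> Decision
decide threshold collection (content, similarity)
  | any ((== content) . fst) collection = AlreadyPresent
  | null close                          = Fresh
  | otherwise                           = Similar close
  where
    close =
      [ e
      | e@(_, s) <- collection
      , length s == length similarity
      , hammingHex s similarity <= threshold
      ]
```

For the two similar files in the example run, hammingHex applied to the parts after the dash gives 2, so any threshold of 2 or more would put them in the Similar case under these assumptions.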

Installing

Nix (recommended)

The preferred way to install is with Nix:

$ nix-env -if .

Most derivations are cached by cache.nixos.org so it won’t take too long to build.

This also automatically makes sure feh is available.

Stack

If you have Stack but not Nix, you can (probably) install it with:

$ stack install

You also need to install feh.