Image-Sort-Utility

Pick an image folder, pick a directory with subfolders, then lean back and sort your images by voice.

A macOS binary is available in releases. Tested on macOS Mojave 10.14.6. Catalina support is WIP.

Usage

Open the application and use speech to categorize the shown images from your directory. The images are copied into the corresponding folders.
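
The copy step can be pictured with a minimal sketch. It assumes the spoken category has already been recognized as a plain string; `copy_to_category` and its parameters are hypothetical names for illustration and are not taken from the application's source.

```python
import shutil
from pathlib import Path


def copy_to_category(image: Path, categories_dir: Path, category: str) -> Path:
    """Copy `image` into the first-level folder named `category`.

    Hypothetical helper: the source file stays in place, matching the
    "images are copied" behaviour described above.
    """
    target_dir = categories_dir / category
    if not target_dir.is_dir():
        raise ValueError(f"unknown category: {category}")
    # copy2 also preserves timestamps and other metadata where possible.
    return Path(shutil.copy2(str(image), str(target_dir / image.name)))
```

Because the file is copied rather than moved, a wrongly recognized category can be corrected by hand without losing the original image.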

Confirm the Run OCR dialog at the end to also extract text from the images. All images in the destination folders are run through OCR and their contents are stored in /images_text.txt.
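
As a rough illustration of this OCR pass, the sketch below calls the Tesseract command-line tool once per image and appends the recognized text to an output file. How the application actually invokes Tesseract is not stated here, and it may well use a wrapper library instead. `tesseract_command`, `ocr_images` and the block separator format are assumptions made for this example.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def tesseract_command(image: Path, lang: str = "eng") -> List[str]:
    """Build a Tesseract CLI call that prints the recognized text to stdout."""
    return ["tesseract", str(image), "stdout", "-l", lang]


def ocr_images(images: Iterable[Path], out_file: Path, lang: str = "eng") -> None:
    """Append the text of every image to `out_file`, one labelled block each."""
    with out_file.open("a", encoding="utf-8") as out:
        for image in images:
            result = subprocess.run(
                tesseract_command(image, lang),
                stdout=subprocess.PIPE,
                stderr=subprocess.DEVNULL,
                check=True,  # raise if Tesseract exits with an error
            )
            text = result.stdout.decode("utf-8")
            # Assumed layout: a header line naming the image, then its text.
            out.write(f"--- {image} ---\n{text}\n")
```

Running `ocr_images` requires the `tesseract` binary on your PATH. Only `tesseract_command` can be tried without it.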

You can also choose Menu > Run OCR to run OCR on an arbitrary directory. When finished, this shows a confirmation dialog and then automatically closes the program.

Notes

The current OCR language is Tesseract's default, English ('eng').
Categories are first-level folders.
For OCR, only images in the root and in first-level folders (that is, the categories) are parsed.
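
The "root and first-level folders only" rule can be sketched as follows. `images_for_ocr` and the `IMAGE_SUFFIXES` set are hypothetical; the file types the application really accepts may differ.

```python
from pathlib import Path
from typing import List

# Assumed set of image extensions; the application's real list may differ.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff"}


def images_for_ocr(root: Path) -> List[Path]:
    """Return images directly in `root` and in its first-level folders only."""
    found = []
    for entry in sorted(root.iterdir()):
        if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES:
            found.append(entry)
        elif entry.is_dir():
            # One level down: the category folders. Deeper folders are skipped.
            found.extend(
                p for p in sorted(entry.iterdir())
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return found
```

An image stored in a folder nested inside a category folder is therefore never picked up.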

Tech

Install

Copy ImageSort into your Applications folder and run it
Accept mic and folder access when prompted
Select your image folder first, and then the directory containing the folders to sort the images into. The rest is explained in the application.

Development

Install tesseract and make sure its binary is in your path.

brew install tesseract
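
To confirm that the binary is reachable, a quick check along these lines may help. `find_tesseract` is a hypothetical helper and not part of the project.

```python
import shutil
from typing import Optional


def find_tesseract(binary: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract binary, or None if it is not on PATH."""
    return shutil.which(binary)
```

If `find_tesseract()` returns `None`, the directory Homebrew installed Tesseract into is probably missing from your PATH.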

Use Python 3.6.1+. Install further dependencies:

pip install -r requirements.txt

Set DEV = True in _constants.py. Run:

python .

Bundling

Set DEV = False in _constants.py.

To build the release, you need to put the Tesseract files into the root directory of the executable via a PyInstaller .spec file. Use something like Tree(<tesseract dir>) after a.binaries to put the Tesseract files into the executable.
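
A .spec fragment in this spirit might look like the sketch below. It is not the project's actual spec file. The entry script `__main__.py`, the `prefix="tesseract"` value, the `name` and the `console` setting are assumptions, the exact argument list varies between PyInstaller versions, and `<tesseract dir>` is left as a placeholder for your local path.

```python
# Fragment of a PyInstaller .spec file; it is not runnable on its own.
# Analysis, PYZ, EXE and Tree are provided by PyInstaller when it runs the spec.

TESSERACT_DIR = "<tesseract dir>"  # placeholder: your local Tesseract directory

a = Analysis(["__main__.py"])  # assumed entry script
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    # Bundle every file below TESSERACT_DIR under "tesseract/" in the executable.
    Tree(TESSERACT_DIR, prefix="tesseract"),
    a.datas,
    name="ImageSort",  # assumed name
    console=False,     # assumed: windowed app
)
```

Pass the spec file to PyInstaller (`pyinstaller <your spec file>`) rather than the script, so that the `Tree(...)` entry is honoured.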

Languages

You can change the OCR language by changing OCR_LANG in _constants.py, e.g. to eng+fra, and downloading the corresponding dataset from Tesseract-OCR lang into your 'tessdata/' folder. For bundling, put the .traineddata file into dist/tesseract/share/tessdata/ before running PyInstaller.
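
Since a value such as eng+fra names several languages joined by "+", and each one needs its own `<code>.traineddata` file, a small check like the following can catch a missing download before OCR runs. `missing_traineddata` is a hypothetical helper, and the tessdata location is passed in rather than assumed.

```python
from pathlib import Path
from typing import List


def missing_traineddata(ocr_lang: str, tessdata_dir: Path) -> List[str]:
    """List the language codes in `ocr_lang` that have no .traineddata file."""
    codes = [code for code in ocr_lang.split("+") if code]
    return [
        code for code in codes
        if not (tessdata_dir / f"{code}.traineddata").is_file()
    ]
```

For example, with only eng.traineddata present, `missing_traineddata("eng+fra", tessdata_dir)` reports `fra` as missing.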

xanpj/Image-Sort-Utility