/AIOCR

OCR Images via API and copy result to clipboard

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

AIOCR : OCR Images via API and copy to clipboard

This program:

  1. Creates a random filename
  2. Takes a screenshot with scrot and stores it in /tmp/OCR
  3. Gets its mimetype (ancient artifact)
  4. encodes to base64
  5. OCR’s via API
  6. Copies result to clipboard with xclip
  7. Writes result to file in /tmp/OCR
  8. Sends a notification with notify-send

Required Tools

Requires the following tools:

  • pwgen
  • scrot
  • base64
  • xclip
  • notify-send

API Key

You need to s/YOUR_API_KEY_HERE/[ API key from https://openrouter.ai/settings/keys ]/

Usage

Run with uv --cache-dir=/tmp/uv run aiocr.py Bind it to a shortcut. Your cursor will change, this is scrot ’s doing. Select an area and it will (hopefully) be transcribed.

Caveats

  • The prompt is geared towards Org-Mode users. Modify it to get output in MarkDown (or whatever you prefer).
  • It uses a free model and it sometimes drops requests. Change to a better / paid model to increase reliability.