This processes every image in the directory and creates a single `output.md`
file containing the OCR results for all of them.
- Creates a markdown file with the OCR text content
- Output file is saved in the same directory as the input file
- Example: `image.jpg` → `image.md`
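
The naming rule above can be sketched with Node's built-in `path` module. This is only an illustration: `markdownPathFor` is a hypothetical helper name, not necessarily what the script uses internally.

```javascript
const path = require("node:path");

// Hypothetical helper: maps an input image path to the markdown file
// written next to it (same directory, same base name, .md extension).
function markdownPathFor(imagePath) {
  const { dir, name } = path.parse(imagePath);
  return path.join(dir, `${name}.md`);
}
```

For instance, `markdownPathFor("image.jpg")` yields `image.md`, and an image inside a subdirectory keeps that subdirectory in its output path.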
- Creates a single `output.md` file in the target directory
- Each image's content is separated by headers
- Format:

      # image1.jpg
      [OCR content for image1]

      # image2.jpg
      [OCR content for image2]
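
As a rough sketch of how such a combined file could be assembled, the function below joins per-image results under one header each. The name `buildCombinedMarkdown`, the `{ fileName, text }` result shape, and the exact blank-line spacing are assumptions made for illustration.

```javascript
// Hypothetical helper: joins per-image OCR results into one markdown string.
// `results` is assumed to be an array of { fileName, text } objects.
function buildCombinedMarkdown(results) {
  return results
    // One "# <file name>" header per image, followed by its trimmed OCR text.
    .map(({ fileName, text }) => `# ${fileName}\n${text.trim()}\n`)
    // A blank line separates consecutive image sections.
    .join("\n");
}
```

The returned string can then be written to `output.md` in the target directory.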
The script includes robust error handling:
- Automatic retry (up to 3 times) for failed OCR operations
- Graceful handling of invalid files/directories
- Detailed error logging
- Continues processing remaining files even if one fails
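
A retry wrapper along these lines might look like the following minimal sketch. It assumes "up to 3 times" means three attempts in total and uses a fixed delay between attempts. `withRetry`, `attempts`, and `delayMs` are hypothetical names, and the script's actual implementation may differ.

```javascript
// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls `operation` until it succeeds or the attempts
// run out, logging each failure and rethrowing the last error at the end.
async function withRetry(operation, { attempts = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      console.error(`Attempt ${attempt}/${attempts} failed: ${error.message}`);
      // Wait before the next attempt, but not after the final one.
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  throw lastError;
}
```

Wrapping each OCR call in `withRetry` and catching the final error per file is one way to let the remaining files continue when a single image keeps failing.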
- Uses `llama-ocr` for image processing
- Implements a retry mechanism with configurable attempts and delay
- Processes files asynchronously
- Preserves original file names in output
- Handles both file and directory inputs
- Only processes image files
- Requires an active Together AI API key
- Large images may take noticeably longer to process
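
To illustrate how non-image files could be skipped while original file names are preserved, here is a hedged sketch. The extension list is an assumption (the formats the script accepts are not listed here), and `isImageFile`, `processAll`, and the injected `ocrFn` callback are hypothetical names standing in for the real OCR call.

```javascript
const path = require("node:path");

// Assumed extension list; the script's actual list is not documented here.
const IMAGE_EXTENSIONS = new Set([".jpg", ".jpeg", ".png", ".gif", ".webp"]);

// True when the file name ends in a known image extension (case-insensitive).
function isImageFile(fileName) {
  return IMAGE_EXTENSIONS.has(path.extname(fileName).toLowerCase());
}

// Runs `ocrFn` on each image in turn, keeping the original file name with
// its text and skipping any file whose OCR call throws.
async function processAll(fileNames, ocrFn) {
  const results = [];
  for (const fileName of fileNames.filter(isImageFile)) {
    try {
      results.push({ fileName, text: await ocrFn(fileName) });
    } catch (error) {
      console.error(`Skipping ${fileName}: ${error.message}`);
    }
  }
  return results;
}
```

This version handles images one at a time, which keeps the output order predictable. Running the calls concurrently would be faster but may run into API rate limits.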
Contributions are welcome! Please feel free to submit a Pull Request.
[Your chosen license]
- llama-ocr library
- Together AI for OCR API services