luandro/batch_ocr

A simple Node.js script for batch OCR using llama-ocr and the free Together AI API

TypeScript

When given a directory, the script processes all images in it and creates a single output.md file containing the OCR results for all images.

Output Format

Single File

Creates a markdown file with the OCR text content
Output file is saved in the same directory as the input file
Example: image.jpg → image.md
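
The naming rule above can be sketched as a small helper. This is an illustration only: `markdownPathFor` is a hypothetical name, not a function taken from the script, and it assumes the output keeps the input's directory and base name and only swaps the extension.

```typescript
import * as path from "node:path";

// Hypothetical helper: derive the output path for a single image.
// Keeps the directory and base name, replaces the extension with .md.
function markdownPathFor(imagePath: string): string {
  const { dir, name } = path.parse(imagePath);
  return path.join(dir, `${name}.md`);
}

console.log(markdownPathFor("image.jpg"));
```

Because the directory part is preserved, the markdown file lands next to the image it was generated from.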

Directory

Creates a single output.md file in the target directory
Each image's content is separated by headers

Format:

# image1.jpg

[OCR content for image1]

# image2.jpg

[OCR content for image2]
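
A minimal sketch of how the combined file shown above could be assembled, assuming the OCR text for each image is already available. The `OcrResult` type and `buildCombinedMarkdown` function are hypothetical names used for illustration, not part of the script.

```typescript
// Hypothetical shape for one processed image.
interface OcrResult {
  fileName: string; // original file name, used as the header
  text: string;     // OCR output for that image
}

// Join all results into one markdown document:
// a "# <file name>" header, a blank line, then the OCR text.
function buildCombinedMarkdown(results: OcrResult[]): string {
  return results
    .map(({ fileName, text }) => `# ${fileName}\n\n${text.trim()}\n`)
    .join("\n");
}

const combined = buildCombinedMarkdown([
  { fileName: "image1.jpg", text: "[OCR content for image1]" },
  { fileName: "image2.jpg", text: "[OCR content for image2]" },
]);
console.log(combined);
```

The resulting string could then be written to output.md in the target directory.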

Error Handling

The script includes robust error handling:

Automatic retry (up to 3 times) for failed OCR operations
Graceful handling of invalid files/directories
Detailed error logging
Continues processing remaining files even if one fails
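
The retry and continue-on-failure behavior listed above might look roughly like the sketch below. It is not the script's actual code: `withRetry`, `processAll`, and the `runOcr` callback are hypothetical names, the default delay of 1000 ms is an assumption, and "up to 3 times" is read here as three attempts in total.

```typescript
// Resolve after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying on failure. Assumption: `attempts` counts total tries.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      console.error(`Attempt ${attempt}/${attempts} failed:`, error);
      if (attempt < attempts) await sleep(delayMs); // wait before the next try
    }
  }
  throw lastError; // every attempt failed
}

// Process files one by one; a file that still fails after all
// attempts is logged and skipped so the rest are still processed.
async function processAll(
  files: string[],
  runOcr: (file: string) => Promise<string>, // stand-in for the real OCR call
  attempts = 3,
  delayMs = 1000,
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  for (const file of files) {
    try {
      results.set(file, await withRetry(() => runOcr(file), attempts, delayMs));
    } catch (error) {
      console.error(`Skipping ${file} after ${attempts} failed attempts:`, error);
    }
  }
  return results;
}
```

Passing the OCR call in as a callback keeps the retry logic independent of llama-ocr, which also makes it easy to exercise with a fake function.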

Technical Details

Uses llama-ocr for image processing
Implements a retry mechanism with configurable attempts and delay
Processes files asynchronously
Preserves original file names in output
Handles both file and directory inputs
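
Handling both file and directory inputs requires deciding which one was passed. One possible way to do that is sketched below using Node's `fs.statSync`; the `classifyInput` function and the `InputKind` type are hypothetical, and the script itself may make this decision differently.

```typescript
import { statSync } from "node:fs";

type InputKind = "file" | "directory" | "invalid";

// Hypothetical helper: classify the path given on the command line.
// Missing or unreadable paths are reported as "invalid" instead of throwing.
function classifyInput(inputPath: string): InputKind {
  try {
    const stats = statSync(inputPath);
    if (stats.isDirectory()) return "directory";
    if (stats.isFile()) return "file";
    return "invalid"; // sockets, devices, and similar
  } catch {
    return "invalid"; // path does not exist or cannot be accessed
  }
}

console.log(classifyInput("."));
```

Returning a value rather than throwing lets the caller print a clear message for an invalid path and exit cleanly.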

Limitations

Only processes image files
Requires an active Together AI API key
Processing large images may take longer
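
Restricting processing to image files could be done with an extension check like the one below. The extension list is an assumption made for this sketch, since the supported formats are not listed here, and `isImageFile` is a hypothetical name.

```typescript
import * as path from "node:path";

// Assumed set of extensions; adjust to whatever the OCR backend accepts.
const IMAGE_EXTENSIONS = new Set([".jpg", ".jpeg", ".png", ".webp"]);

// Hypothetical helper: true when the file name has a known image extension.
function isImageFile(fileName: string): boolean {
  return IMAGE_EXTENSIONS.has(path.extname(fileName).toLowerCase());
}

const candidates = ["scan.JPG", "notes.txt", "page.png", "README"];
console.log(candidates.filter(isImageFile));
```

Lowercasing the extension means files such as scan.JPG are not skipped by accident.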

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

[Your chosen license]

Acknowledgments

llama-ocr library
Together AI for OCR API services