/batch_ocr

A simple Node.js script for batch OCR using llama-ocr and the free Together AI API.

Primary Language: TypeScript

This will process all images in the directory and create a single output.md file containing the OCR results for all images.

Output Format

Single File

  • Creates a markdown file with the OCR text content
  • Output file is saved in the same directory as the input file
  • Example: image.jpg → image.md
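
The naming rule above could be sketched as follows. This is a minimal illustration, not the script's actual code: `outputPathFor` is a hypothetical helper name, and it assumes the output keeps the input's directory and base name and only swaps the extension for `.md`.

```typescript
import * as path from "node:path";

// Hypothetical helper (not taken from the script): derive the markdown
// output path for a single input image, keeping it in the same directory.
function outputPathFor(inputPath: string): string {
  const { dir, name } = path.parse(inputPath);
  // path.join with an empty dir returns just the file name.
  return path.join(dir, `${name}.md`);
}

console.log(outputPathFor("image.jpg"));
```

Using `path.parse` rather than string replacement avoids surprises with file names that contain several dots, such as `scan.v2.jpg`.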

Directory

  • Creates a single output.md file in the target directory
  • Each image's content is separated by headers
  • Format:
    # image1.jpg
    
    [OCR content for image1]
    
    # image2.jpg
    
    [OCR content for image2]
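
One way to assemble the combined file in the format shown above is sketched below. The `OcrResult` type and `combineResults` function are illustrative names, not part of the script, and the sketch assumes each image's OCR text is already available as a string.

```typescript
// Hypothetical shape for one processed image (an assumption for this sketch).
interface OcrResult {
  fileName: string;
  markdown: string;
}

// Build the contents of output.md: one "# <file name>" header per image,
// followed by that image's OCR text, with blank lines between sections.
function combineResults(results: OcrResult[]): string {
  const sections = results.map(
    ({ fileName, markdown }) => `# ${fileName}\n\n${markdown.trim()}`
  );
  return sections.join("\n\n") + "\n";
}

console.log(
  combineResults([
    { fileName: "image1.jpg", markdown: "[OCR content for image1]" },
    { fileName: "image2.jpg", markdown: "[OCR content for image2]" },
  ])
);
```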

Error Handling

The script handles errors as follows:

  • Automatic retry (up to 3 times) for failed OCR operations
  • Graceful handling of invalid files/directories
  • Detailed error logging
  • Continues processing remaining files even if one fails
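
The retry-and-continue behavior listed above might look roughly like this. It is a sketch under stated assumptions, not the script's implementation: `withRetry` and `processAll` are hypothetical names, "up to 3 times" is read as three total attempts, the 1000 ms delay is an arbitrary default, and the OCR call is passed in as a function so the example does not depend on any particular library signature.

```typescript
// Retry an async operation up to maxAttempts times, waiting delayMs between tries.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3, // assumed meaning of "up to 3 times"
  delayMs = 1000 // assumed default; the real value is not stated here
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      console.error(`Attempt ${attempt}/${maxAttempts} failed:`, error);
      if (attempt < maxAttempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Process every file; a file that still fails after all attempts is logged
// and skipped so the remaining files are still processed.
async function processAll(
  files: string[],
  runOcr: (file: string) => Promise<string>,
  maxAttempts = 3,
  delayMs = 1000
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  for (const file of files) {
    try {
      results.set(file, await withRetry(() => runOcr(file), maxAttempts, delayMs));
    } catch (error) {
      console.error(`Skipping ${file} after ${maxAttempts} failed attempts:`, error);
    }
  }
  return results;
}
```

Catching the error per file, outside `withRetry`, is what lets one bad image fail without aborting the whole batch.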

Technical Details

  • Uses llama-ocr for image processing
  • Implements a retry mechanism with configurable attempts and delay
  • Processes files asynchronously
  • Preserves original file names in output
  • Handles both file and directory inputs
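
Handling both file and directory inputs could be sketched as below. The names `collectImages` and `isImageFile` are hypothetical, and the extension list is an assumption, since the formats the script accepts are not listed here.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed set of image extensions; adjust to whatever the script supports.
const IMAGE_EXTENSIONS = new Set([".jpg", ".jpeg", ".png"]);

function isImageFile(filePath: string): boolean {
  return IMAGE_EXTENSIONS.has(path.extname(filePath).toLowerCase());
}

// Return the image paths to process for a file or directory input.
// fs.statSync throws if the path does not exist, which the caller can report.
function collectImages(inputPath: string): string[] {
  const stats = fs.statSync(inputPath);
  if (stats.isDirectory()) {
    return fs
      .readdirSync(inputPath, { withFileTypes: true })
      .filter((entry) => entry.isFile() && isImageFile(entry.name))
      .map((entry) => path.join(inputPath, entry.name))
      .sort(); // stable order so output.md sections are predictable
  }
  return isImageFile(inputPath) ? [inputPath] : [];
}
```

A single-file input returns a one-element list, so the rest of the pipeline can treat both cases the same way.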

Limitations

  • Only processes image files
  • Requires an active Together AI API key
  • Processing large images may take longer
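
Because a key is required, it can help to fail early with a clear message before any image is processed. The sketch below assumes the key is supplied through a `TOGETHER_API_KEY` environment variable; that variable name and the `requireApiKey` helper are assumptions for illustration and may not match how the script is configured.

```typescript
// Hypothetical helper: read the API key from an environment-like object and
// fail fast with a readable error if it is missing or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.TOGETHER_API_KEY?.trim(); // assumed variable name
  if (!key) {
    throw new Error("TOGETHER_API_KEY is not set; an active Together AI API key is required.");
  }
  return key;
}

// Typical use at startup would be: const apiKey = requireApiKey(process.env);
```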

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

[Your chosen license]

Acknowledgments