Node.js wrapper for Tesseract OCR CLI.

Tesseract OCR for Node.js

Simple and modern Node.js wrapper implementation for Tesseract OCR CLI.

Features & Advantages

  • focus on high performance
  • both promise and callback APIs are supported
  • full test coverage
  • no synchronous operations
  • no temp files are used

Usage

const recognize = require('tesseractocr')

recognize(`${__dirname}/image.png`, (err, text) => {
    if (err)
        throw err
    else
        console.log('Yay! Text recognized!', text)
})
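The feature list also mentions a promise API. The sketch below shows how it might be used to process several images concurrently. It assumes that calling recognize with only a path (no callback) returns a promise that resolves to the recognized text. The recognizeAll helper is hypothetical and not part of the package; check the API docs for the exact signature.

```javascript
// Hypothetical helper, not part of tesseractocr. The `recognize` function is
// passed in as an argument; in a real project it would be require('tesseractocr').
async function recognizeAll(recognize, files) {
    // Assumption: recognize(path) returns a promise when no callback is given.
    // All recognitions start at once and results keep the order of `files`.
    return Promise.all(files.map(file => recognize(file)))
}

// Example usage (assumes the package is installed and the images exist):
// const recognize = require('tesseractocr')
// recognizeAll(recognize, [`${__dirname}/a.png`, `${__dirname}/b.png`])
//     .then(texts => console.log(texts))
//     .catch(err => console.error('Recognition failed:', err))
```

Because Promise.all rejects as soon as any single recognition fails, one bad image aborts the whole batch. Use Promise.allSettled instead if partial results are acceptable.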

Overall API docs

The overall API documentation can be found here

Installation

There is a hard dependency on the Tesseract project. You can find installation instructions for various platforms on the project site. For Homebrew users, the installation is quick and easy.

brew install tesseract --with-all-languages

The command above installs every available language package. If you don't need them all, drop the --with-all-languages flag and install the languages you need manually: download them to your local machine, then point the TESSDATA_PREFIX environment variable at the directory that contains them:

export TESSDATA_PREFIX=~/Downloads/

Then install the Node.js package to expose the JavaScript API:

npm install tesseractocr
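Because the package depends on the Tesseract CLI, it can help to confirm that the binary is reachable before calling the wrapper. The following is a minimal sketch using only Node's built-in child_process module. The tesseractVersion helper is hypothetical and not part of tesseractocr, and it assumes the binary is named tesseract and is on your PATH.

```javascript
const { execFile } = require('child_process')

// Hypothetical helper: resolves with the first line of `tesseract --version`,
// or with null if the binary cannot be executed (for example, not installed).
function tesseractVersion(binary = 'tesseract') {
    return new Promise(resolve => {
        execFile(binary, ['--version'], (err, stdout, stderr) => {
            if (err)
                return resolve(null)
            // Some Tesseract builds print the version banner to stderr.
            const banner = (stdout || stderr).trim().split('\n')[0]
            resolve(banner)
        })
    })
}

tesseractVersion().then(version => {
    if (version === null)
        console.error('Tesseract CLI not found; install it first.')
    else
        console.log('Found', version)
})
```

The helper never rejects, so callers can branch on a null result instead of wrapping the check in a try/catch.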

Tests and benchmarks

Clone the repo, run npm install, and then run npm test or npm run benchmarks.

Changelog

The project's changelog is available here

License

MIT