java-ocr-api

Java OCR performs OCR and barcode recognition on images (JPEG, PNG, TIFF, PDF, etc.) and outputs the result as plain text, XML with full coordinates, or searchable PDF.

Primary language: Java. License: GNU Affero General Public License v3.0 (AGPL-3.0).

Core Features:

  • High Level of Accuracy: recognizes text even in documents with poor image quality.

  • Format Retention: text layouts of the input documents are preserved.

  • Images To Searchable PDF: converts images in formats such as JPEG, PNG, TIFF, and PDF into searchable PDF or PDF/A files.

  • Data Capture and Table Detection: table and cell information is available for data capture.

  • 20+ Languages: e.g., English, Spanish, French, German, Italian, Hungarian, Finnish, Swedish, Romanian, Polish, Malay, Arabic, Indonesian, and Russian.

  • Barcode Recognition: the following barcode formats are supported:

    • Code 128 (128B, 128C, 128 raw)
    • EAN-8, EAN-13
    • UPC
    • Code 3 of 9 (Code 39)
    • Interleaved 2 of 5
    • QR Code
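
Several of the formats listed above carry a check digit, so a recognized value can be sanity-checked after decoding. The sketch below validates an EAN-13 string using the standard EAN-13 checksum. It is not part of this library's API: the class name `Ean13Check` and the method `isValid` are hypothetical names chosen for illustration, and the sketch assumes the recognized barcode value is available as a plain `String`.

```java
// Illustrative helper, NOT part of java-ocr-api: the class and method names are assumptions.
final class Ean13Check {
    private Ean13Check() {}

    /**
     * Returns true if the input is exactly 13 ASCII digits and the last digit
     * equals the EAN-13 check digit computed from the first 12.
     */
    static boolean isValid(String code) {
        if (code == null || code.length() != 13) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 12; i++) {
            char c = code.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
            int digit = c - '0';
            // Positions 1, 3, 5, ... (index 0, 2, 4, ...) have weight 1; the others have weight 3.
            sum += (i % 2 == 0) ? digit : 3 * digit;
        }
        char last = code.charAt(12);
        if (last < '0' || last > '9') {
            return false;
        }
        // The check digit brings the weighted sum up to the next multiple of 10.
        int expected = (10 - sum % 10) % 10;
        return expected == last - '0';
    }
}
```

For example, `Ean13Check.isValid("4006381333931")` returns `true`, while changing the final digit to `2` makes it return `false`. A check like this only catches misreads within a decoded value; it says nothing about how the recognition itself is performed.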