hocr
There are 38 repositories under the hocr topic.
UglyToad/PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
manisandro/gImageReader
A Gtk/Qt front-end to tesseract-ocr.
mittagessen/kraken
OCR engine for all the languages
BobLd/DocumentLayoutAnalysis
Resource repository for document layout analysis development with PdfPig.
UB-Mannheim/ocr-fileformat
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
cneud/ocr-conversion
Conversions between various OCR formats
filak/hOCR-to-ALTO
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
dbmdz/mirador-textoverlay
Text Overlay plugin for Mirador 3
UB-Mannheim/ocr-gt-tools
Ergonomic line-by-line transcription of scanned text.
dmi3kno/hocr
Text-to-tibble
fakabbir/OCR
Probabilistic key-value pair extraction from invoices (non-searchable PDFs) using word weights
macabeus/pyslibtesseract
✏️ Integration of Tesseract for Python using a shared library
GeReV/hocr-editor-ts
A visual hOCR file editor
GeReV/HocrEditor
A visual editor for .hocr files.
hadro/new-york-city-directories
Some basic data and text extraction from the New York City Directories
ansonl/flyspacea-backend
Fly Space-A Facebook flight schedule photo aggregator and processor back-end server.
emmeryn/hocr-turtletext
A gem that parses positional text from hOCR output and provides convenience methods to find text.
hadro/brewery-guides
The data for guides to breweries across the United States from 1896 to 1918
jlieth/hocr-parser
Python parser for hOCR files using lxml
mayurcybercz/AI-Exam-evaluation
CLI tool that recognises handwritten text from answer sheets using Tesseract OCR, then uses the extracted text to evaluate marks with NLP.
ImageProcessing-ElectronicPublications/hocr-tools
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
stefan6419846/hocr-tools
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
trufanov-nok/tesseract2djvused
A simple Tesseract 3.02+ hOCR to djvused format converter written in Qt
Ansh420/Hocr_Preservation-in-pytesseract
hOCR is a format for OCR output that preserves the layout of the original document, and pytesseract can output text in this format.
hnjm/kraken
OCR engine for all the languages
nuxeo-sandbox/nuxeo-platform-hocr
Perform OCR on images within Nuxeo with Tesseract and hOCR
rdmpage/hocr-proofreader
Web based JavaScript GUI library for proofreading/editing hOCR
z4y4ts/hocr-bboxes-viewer
Quick and dirty visualization of hOCR bounding boxes on a page
darkn3to/pdfocr
A simple Spring Boot application to convert image-based PDFs to text-embedded PDFs.
hansalemaos/tesseract_hocr_to_csv
Fast hOCR-to-CSV parser
ImageProcessing-ElectronicPublications/tesseract
Tesseract Open Source OCR Engine (main repository)
ImageProcessing-ElectronicPublications/tesseract2djvused
A simple Tesseract 3.02+ hOCR to djvused format converter written in Qt
milekpl/modi2hocr
Automatically exported from code.google.com/p/modi2hocr
Rajasekaran85/Python-TIFF-to-OCR-XML
Converts TIFF images into OCR XML using Tesseract
ZeinabTaghavi/opencv-python
Some code segments used in denoising