Tooling for extracting data from scanned paper forms that have been OCR-ed by Tesseract into the hOCR format.
Primary language: Python
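
As a rough illustration of what extracting a form field from hOCR can look like, here is a minimal sketch using only the Python standard library. It is not this project's actual code: the names `HocrWordParser`, `parse_bbox`, `field_text`, and the `SAMPLE` markup are hypothetical and made up for the example. It assumes the hOCR input marks each word as a `span` with class `ocrx_word` and a `bbox x0 y0 x1 y1` entry in its `title` attribute, and that word spans contain no nested spans.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title string, or None if absent."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) pairs for every ocrx_word span (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes word spans hold no nested spans, so the next </span> closes the word.
        if tag == "span" and self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def field_text(words, region):
    """Join the words whose bbox centre lies inside region (x0, y0, x1, y1)."""
    rx0, ry0, rx1, ry1 = region
    picked = []
    for text, bbox in words:
        if bbox is None:
            continue
        cx = (bbox[0] + bbox[2]) / 2
        cy = (bbox[1] + bbox[3]) / 2
        if rx0 <= cx <= rx1 and ry0 <= cy <= ry1:
            picked.append(text)
    return " ".join(picked)


# Invented sample markup in the general shape of hOCR word output.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 1000 1400'>
<span class='ocr_line' title='bbox 100 200 450 240'>
<span class='ocrx_word' title='bbox 100 200 180 240; x_wconf 96'>Name:</span>
<span class='ocrx_word' title='bbox 200 200 300 240; x_wconf 91'>Ada</span>
<span class='ocrx_word' title='bbox 310 200 450 240; x_wconf 89'>Lovelace</span>
</span></div>"""

parser = HocrWordParser()
parser.feed(SAMPLE)
name_value = field_text(parser.words, (190, 190, 460, 250))
print(name_value)
```

The idea in this sketch is that a form field is described by a fixed rectangle on the page, and its value is whatever recognized words fall inside that rectangle. Real scans usually need extra handling for skew, scaling, and low-confidence words, which is left out here.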