ocrd_origami

OCR-D wrapper for poke1024/origami OLR+OCR

Introduction

This module provides OCR-D compliant workspace processors for Origami, a document image processing suite for historical newspapers.

... WORK IN PROGRESS ...

Installation

First install system dependencies:

sudo make deps-ubuntu

(Besides Python>=3.7 you'll need at least libffi-dev, libcgal-dev and git, plus a recent tesseract.)
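
To check these prerequisites before running the remaining steps, a small script along the following lines may help. It is only a sketch and not part of this repository: the function name `missing_prerequisites` is made up for illustration, and it only looks for the `git` and `tesseract` executables and the Python version. The development libraries (libffi-dev, libcgal-dev) cannot be detected this way.

```python
import shutil
import sys

# Executables named in the note above. The -dev library packages are not
# executables, so this check does not cover them.
REQUIRED_TOOLS = ("git", "tesseract")
MIN_PYTHON = (3, 7)


def missing_prerequisites(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper for illustration; not provided by ocrd_origami.
    """
    problems = []
    # Compare only (major, minor) against the required minimum version.
    if sys.version_info[:2] < tuple(min_python):
        problems.append("Python %d.%d or newer is required" % tuple(min_python))
    # shutil.which returns None when an executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append("'%s' was not found on PATH" % tool)
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

An empty result only means that nothing obvious is missing. It does not check whether the installed tesseract is recent enough.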

Now clone the origami submodule, if you have not already:

make origami

This is equivalent to:

git submodule update --init origami

Create and activate a virtual environment as usual.

To install Python dependencies:

make deps

This is equivalent to:

pip install -r requirements.txt
pip install -r origami/requirements/pip.txt
pip install -r origami/requirements/conda.txt

To install this module, do:

make install

This is equivalent to:

pip install .

Usage

OCR-D processor interface ocrd-origami-segment

To be used with PAGE-XML documents in an OCR-D annotation workflow.

... SHOW OCRD CLI HERE...
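
As a rough sketch of how the processor might be called from Python, the following assembles a command line from the generic OCR-D processor options (`-m` for the METS file, `-I` for the input file group, `-O` for the output file group). The helper name `build_segment_command`, the file group names `OCR-D-IMG` and `OCR-D-SEG`, and the default `mets.xml` path are assumptions for illustration; any processor-specific parameters are not covered here.

```python
import shlex


def build_segment_command(input_grp, output_grp, mets="mets.xml"):
    """Assemble an argument list using the generic OCR-D processor options.

    Hypothetical helper for illustration; not provided by ocrd_origami.
    """
    return [
        "ocrd-origami-segment",
        "-m", mets,        # path to the workspace METS file
        "-I", input_grp,   # input file group
        "-O", output_grp,  # output file group
    ]


if __name__ == "__main__":
    # OCR-D-IMG and OCR-D-SEG are placeholder file group names.
    cmd = build_segment_command("OCR-D-IMG", "OCR-D-SEG")
    # Print a shell-quoted version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from within an existing OCR-D workspace once the module is installed.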

Testing

(not yet)