Declarative document extraction

Primary language: Python. License: MIT.

Blueprint

Blueprint is a declarative extraction language for semi-structured documents.

Setup

Start by cloning this repo to your machine.

CLI

To run the paystubs reference extraction on a sample paystub:

  • Add path/to/blueprint-oss/blueprint/py to your PYTHONPATH
  • Run pip3 install -r path/to/blueprint-oss/blueprint/requirements.txt
  • From the blueprint/reference_extractions/paystubs folder, run python3 paystubs.py run_model -v -g ocr/sample_paystub.jpg.json
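The PYTHONPATH and run steps above can also be scripted. The following is a minimal sketch, not part of the repo: the helper names build_env, build_command, and run_sample are hypothetical, and it assumes the requirements have already been installed with pip3 as described above. It only reuses the paths and command line given in the steps.

```python
import os
import subprocess
from pathlib import Path


def build_env(repo_root, base_env=None):
    """Return a copy of the environment with <repo_root>/blueprint/py on PYTHONPATH.

    Hypothetical helper; repo_root is wherever you cloned blueprint-oss.
    """
    env = dict(os.environ if base_env is None else base_env)
    py_dir = str(Path(repo_root) / "blueprint" / "py")
    existing = env.get("PYTHONPATH")
    # Prepend so the cloned sources take precedence over any existing entries.
    env["PYTHONPATH"] = py_dir + os.pathsep + existing if existing else py_dir
    return env


def build_command(ocr_json="ocr/sample_paystub.jpg.json"):
    """Return the command line from the steps above as an argument list."""
    return ["python3", "paystubs.py", "run_model", "-v", "-g", ocr_json]


def run_sample(repo_root):
    """Run the sample paystub extraction from the paystubs folder."""
    workdir = Path(repo_root) / "blueprint" / "reference_extractions" / "paystubs"
    return subprocess.run(
        build_command(),
        cwd=workdir,
        env=build_env(repo_root),
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
```

Calling run_sample("path/to/blueprint-oss") would then be equivalent to exporting PYTHONPATH and running the command by hand from the paystubs folder.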

To generate your own OCR documents:

Server

TODO

Studio

TODO