LaurentRisser/pdf_ocr_app

A ready-to-deploy Dash application for parsing PDF files with Tesseract

Python, MIT license

PDF OCR APP


Features

Performs OCR on user-uploaded PDF documents
Displays low-level Tesseract OCR results
Enables SVG download
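
To make the last two features concrete, here is a minimal sketch of how word-level OCR results could be drawn as an SVG overlay. It is not taken from this repository: the function name `words_to_svg` is hypothetical, and the input is assumed to follow the dict layout that `pytesseract.image_to_data(..., output_type=Output.DICT)` returns (parallel lists under `text`, `left`, `top`, `width`, `height`). The sample data is made up for illustration.

```python
from xml.sax.saxutils import escape


def words_to_svg(data, page_width, page_height):
    """Render word boxes as an SVG string (hypothetical helper).

    `data` is assumed to hold parallel lists under the keys
    "text", "left", "top", "width" and "height".
    """
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{page_width}" height="{page_height}">'
    ]
    for text, x, y, w, h in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        if not text.strip():
            continue  # skip rows that carry no recognized text
        # Outline the word, then print the recognized text at its baseline.
        parts.append(
            f'<rect x="{x}" y="{y}" width="{w}" height="{h}" '
            f'fill="none" stroke="red"/>'
        )
        parts.append(
            f'<text x="{x}" y="{y + h}" font-size="{h}">{escape(text)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up sample in the assumed layout; the first row has no text.
sample = {
    "text": ["", "Hello", "R&D"],
    "left": [0, 10, 80],
    "top": [0, 20, 20],
    "width": [200, 60, 40],
    "height": [100, 15, 15],
}
svg = words_to_svg(sample, page_width=200, page_height=100)
```

The text is passed through `escape` so that characters such as `&` do not break the SVG markup. The resulting string can be written to a `.svg` file or offered as a download.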

Example app here

Quick Start

git clone git@github.com:Envinorma/pdf_ocr_app.git
cd pdf_ocr_app
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
cp config-template.ini config.ini # Adapt configuration
python pdf_ocr_app/app/__init__.py # Visit http://127.0.0.1:8050/
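
If the app fails to start after the steps above, a quick check of the prerequisites can help. The sketch below is not part of the repository: `check_prerequisites` is a hypothetical helper, and it assumes that the app reads `config.ini` from the current directory and calls a `tesseract` executable found on the PATH.

```python
import shutil
from pathlib import Path


def check_prerequisites(config_path="config.ini", tesseract_cmd="tesseract"):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    # The Quick Start copies config-template.ini to config.ini.
    if not Path(config_path).is_file():
        problems.append(f"missing config file: {config_path}")
    # Assumption: OCR is run through the `tesseract` command-line program.
    if shutil.which(tesseract_cmd) is None:
        problems.append(f"executable not found on PATH: {tesseract_cmd}")
    return problems


for problem in check_prerequisites():
    print(problem)
```

Run it from the repository root. No output means both checks passed.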

Deploy on Heroku

heroku git:remote -a $app_name
heroku buildpacks:add --index 1 https://github.com/heroku/heroku-buildpack-apt
git push heroku master

MIT license