
Primary language: Python. License: MIT.

PDF OCR APP

A ready-to-deploy Dash application for parsing PDF files with Tesseract

Features

  • Performs OCR on user-uploaded PDF documents
  • Displays low-level Tesseract OCR results
  • Enables SVG download of the results
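
To make the second and third features more concrete, here is a minimal sketch of how low-level Tesseract results could be turned into an SVG overlay. It is not the app's actual code. It assumes the OCR output is available in Tesseract's TSV layout (the columns `level` through `text`), where rows with `level` 5 describe single words. The names `SAMPLE_TSV`, `parse_words` and `words_to_svg` are hypothetical and exist only for this illustration.

```python
import csv
import io
from xml.sax.saxutils import escape

# Hand-written sample in Tesseract's TSV column layout (illustrative data only).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t200\t100\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t10\t20\t60\t15\t96\tHello\n"
    "5\t1\t1\t1\t1\t2\t80\t20\t70\t15\t91\tworld\n"
)


def parse_words(tsv_text):
    """Return one dict per recognized word (rows with level 5 and non-empty text)."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if row["level"] != "5" or not text:
            continue  # skip page, block, paragraph and line rows
        words.append(
            {
                "left": int(row["left"]),
                "top": int(row["top"]),
                "width": int(row["width"]),
                "height": int(row["height"]),
                "conf": float(row["conf"]),
                "text": text,
            }
        )
    return words


def words_to_svg(words, page_width, page_height):
    """Draw each word's bounding box and text into a standalone SVG string."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{page_width}" height="{page_height}">'
    ]
    for w in words:
        parts.append(
            f'<rect x="{w["left"]}" y="{w["top"]}" width="{w["width"]}" '
            f'height="{w["height"]}" fill="none" stroke="red"/>'
        )
        # Place the text baseline at the bottom edge of the box.
        parts.append(
            f'<text x="{w["left"]}" y="{w["top"] + w["height"]}" '
            f'font-size="{w["height"]}">{escape(w["text"])}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


words = parse_words(SAMPLE_TSV)
svg = words_to_svg(words, page_width=200, page_height=100)
```

The resulting `svg` string can be written to a file and opened in a browser. In a real deployment the TSV would come from running Tesseract on each rendered PDF page rather than from a hard-coded sample.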

Example app here

Quick Start

git clone git@github.com:Envinorma/pdf_ocr_app.git
cd pdf_ocr_app
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
cp config-template.ini config.ini # Adapt configuration
python pdf_ocr_app/app/__init__.py # Visit http://127.0.0.1:8050/

Deploy on Heroku

heroku git:remote -a $app_name
heroku buildpacks:add --index 1 https://github.com/heroku/heroku-buildpack-apt
git push heroku master

MIT license