A Django application that extracts text from single-page PDFs and images.
- Also works with scanned PDFs
- Supports single-page PDFs only
- ImageMagick (must be installed on your system)
- Ghostscript (must be installed on your system)
- Tesseract-OCR (must be installed on your system)
- Make sure the environment variables (for example PATH) for all of the above are set correctly, so their executables can be found.
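- Create a virtual environment and activate it
- Install the requirements from the requirements.txt file (pip install -r requirements.txt)
- Set up a PostgreSQL database, or comment out the PostgreSQL database settings and uncomment the SQLite database settings in the settings.py file
- Run the server (for a standard Django project, python manage.py runserver)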
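
Before running the server, it can help to confirm that the three external tools listed above are actually reachable through PATH. The script below is a minimal sketch, not part of this project: the names `REQUIRED_TOOLS`, `find_tools` and `missing_tools` are made up for illustration, and the executable names are assumptions about common installs (`magick` or `convert` for ImageMagick, `gs` or `gswin64c`/`gswin32c` for Ghostscript, `tesseract` for Tesseract-OCR). Adjust them if your installation uses different names.

```python
import shutil

# Candidate executable names per tool. These are assumptions about typical
# installs, not something this project defines.
# Note: on Windows, "convert" may resolve to an unrelated system utility,
# so a hit on "convert" there does not prove ImageMagick is installed.
REQUIRED_TOOLS = {
    "ImageMagick": ["magick", "convert"],
    "Ghostscript": ["gs", "gswin64c", "gswin32c"],
    "Tesseract-OCR": ["tesseract"],
}


def find_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return a dict mapping each tool name to the first executable path
    found on PATH, or None if no candidate name resolves."""
    found = {}
    for name, candidates in tools.items():
        found[name] = next((p for p in map(which, candidates) if p), None)
    return found


def missing_tools(found):
    """Return the sorted names of tools that were not found."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    results = find_tools()
    for name, path in results.items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
    missing = missing_tools(results)
    if missing:
        print("Missing tools:", ", ".join(missing))
```

Save it under any name (for example check_tools.py, a hypothetical filename) and run it with python inside the virtual environment. A tool reported as not found usually means it is either not installed or its install directory is not on PATH.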