PDF Hunter

A command-line tool to find text in scanned or handwritten documents.

How to use?

Before running the Python script, make sure pytesseract, pdf2image, OpenCV, and the other supporting libraries are installed.
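A quick way to confirm the named libraries are present is a small check like the sketch below. It is not part of the project: the helper name missing_modules is hypothetical, and it covers only the three libraries listed above, using their usual import names (OpenCV's Python bindings are imported as cv2).

```python
import importlib.util

# Import names assumed for the libraries the README lists;
# OpenCV's Python bindings are imported as "cv2".
REQUIRED_MODULES = ("pytesseract", "pdf2image", "cv2")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

An empty result only means the Python packages can be found; the Tesseract and Poppler programs themselves still have to be installed separately.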

After installing pytesseract, set the pytesseract.pytesseract.tesseract_cmd variable to the location of your tesseract.exe. Poppler must be installed as well, with poppler_path pointing to its Library/bin folder.
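The sketch below shows roughly how these two settings fit together. It is an illustration under assumptions, not the contents of main.py: both paths are placeholders to replace with your own, and the helper names ocr_pages and find_matches are hypothetical. It relies only on pdf2image's convert_from_path (with its poppler_path argument) and pytesseract's image_to_string.

```python
from typing import Dict, List, Optional

# Placeholder locations (assumptions); replace with the paths on your machine.
TESSERACT_CMD = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
POPPLER_PATH = r"C:\poppler\Library\bin"


def ocr_pages(pdf_path: str,
              tesseract_cmd: str = TESSERACT_CMD,
              poppler_path: Optional[str] = POPPLER_PATH) -> List[str]:
    """Render each PDF page to an image and return its OCR text."""
    # Imported here so the module still loads if the libraries are missing.
    import pytesseract
    from pdf2image import convert_from_path

    # Tell pytesseract where the Tesseract executable lives.
    pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    # pdf2image needs Poppler's bin folder to rasterize the pages.
    pages = convert_from_path(pdf_path, poppler_path=poppler_path)
    return [pytesseract.image_to_string(page) for page in pages]


def find_matches(page_texts: List[str], query: str) -> Dict[int, List[str]]:
    """Map 1-based page numbers to the lines that contain query (case-insensitive)."""
    needle = query.lower()
    matches: Dict[int, List[str]] = {}
    for number, text in enumerate(page_texts, start=1):
        hits = [line.strip() for line in text.splitlines()
                if needle in line.lower()]
        if hits:
            matches[number] = hits
    return matches
```

With both programs installed, find_matches(ocr_pages("scan.pdf"), "invoice") would list the pages and lines that mention the word; scan.pdf is a placeholder file name. OCR of handwriting is often imperfect, so an exact substring match like this one may miss words that Tesseract misreads.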

Run the script with: python main.py