pdf-ocr-extraction

There are 9 repositories under the pdf-ocr-extraction topic.

skylander86/lambda-text-extractor
AWS Lambda functions to extract text from various binary formats.
Language: Python 174 9 543
Clearedge-AI/clearedge
Build a RAG preprocessing pipeline
Language: Jupyter Notebook 11 3 00
omaxel/pdf-ocr
Recognize page content of a PDF as text using Tesseract and Ghostscript.
Language: C# 7 2 01
Achiwilms/OCR-Wizard
A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.
Language: Python 5 1 01
BBC-Esq/Fast-PyOCR
Simple and reliable script to conduct high-quality fast OCR on a PDF
Language: Python 2 1 10
fsdesa/pdf-ocr-service
PDF OCR service in Docker
Language: Java 1 1 00
lakshay1296/OCR_Django_App_Beta
Example Django-Python project which contains OCR, PDF to OCR PDF, Text Similarity/Dissimilarity, PDF to PNG converter modules.
Language: Python 1 1 00
Firefox-1998/UtilityPDF
A utility that collects in one place some operations that are normally done on PDF files.
Language: C# 0 1 00
mcagriaksoy/diff_merge_pdf
A tool to compare PDFs, merge them, display their differences, and run OCR on them.
Language: Python 2 01