ocr-python
There are 479 repositories under the ocr-python topic.
IDPL-PFOD
An Image Dataset of Printed Farsi Text for OCR Research
docai
DocAI helps developers quickly build document, image, and text processing pipelines using open-source and cloud-based machine learning models for a wide range of applications.
transaction_ocr
An open-source tool that extracts transaction information using OCR.
pdftotext
A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora-, Ubuntu-, Debian-, and Arch-based Linux distros, using poppler-utils and Google's tesseract-ocr.
python-OCR
Converts invoice PDFs to images and images to text, then extracts invoice information such as the invoice number or vendor name from the text.
OpenCV-OCR
OpenCV OCR (Optical Character Recognition)
Menu_Reader
This is a web application that converts restaurant menus into text using OCR. That text is then sent through a Machine Learning model to output a list of menu items using classification and NLP.
PDF-Converter
Convert your PDF files into Word documents or various image formats locally, without uploading them to unknown servers.
Markdownify
Convert documents and images to high-quality Markdown using vision LLMs. Built for RAG ingestion pipelines.
ScaleDP
ScaleDP is an Open-Source extension of Apache Spark for Document Processing
taco-box
An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR
Multimodal-OCR
A vision language model tailored for tasks that involve [messy] optical character recognition (OCR), image-to-text conversion, and math problem solving with LaTeX formatting.
MTG-OCR-Imagehashing
A self-contained Jupyter notebook demo showing how Tesseract OCR and image hashing can be used to recognize Magic cards. This demo is meant to show how slow and inefficient these methods can be.
EasyOCR-based-Automatic-Bangla-License-Plate-Recognition
EasyOCR is an optical character recognition package built on PyTorch. It can extract text from images and scan documents, which makes it readily applicable to license plate number recognition. License plates around the world use characters from many languages, and Bangla is one of them. Here EasyOCR is applied to Bangla character-based license plate recognition.
Image-table-to-text-
Extracts tabular data from an image and stores it as CSV.
ISBN-Book-OCR
Image to text recognition for ISBN numbers from books.
CaptchaSolver
A tiny program that solves a thousand captcha images to test the quality of the OCR (optical character recognition).
OCR-Wizard
A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.
Persian-OCR-Streamlit
Persian OCR allows users to scan documents and extract text from scanned images.
docling_ocr
A powerful Python package for extracting text from images and documents using the advanced LLM-based SmolDocling-256M-preview model.
AdvAITelegramBot
Advanced AI chatbot for Telegram: GPT-4.1, Qwen-3, DeepSeek-R1, Dall-E-3, Flux, Flux-Pro, Dall-E Model, OCR and Google Voice2Text.
EasyPaddleOCR
A simple package for PaddleOCR on CPU and GPU using PyTorch
VideoSubOCR
OCR automation for VideoSubFinder
mathpx
OCR for Mathematical equations
Tools_DeepSeekOCR
A Windows-based screenshot OCR utility powered by DeepSeek-OCR. This tool allows users to quickly capture screen regions and perform high-accuracy Optical Character Recognition (OCR) directly on the captured image, leveraging the powerful DeepSeek-OCR model. It supports local model deployment and features real-time model output streaming.
queueit-captcha-handler
Queue-it Captchas (BotDetect) Handler API
Discord-OCR-Bot
This is an OCR Bot for Discord made using OpenCV and Pytesseract
pdf-ocr
Converts scanned PDF documents to multiple formats using Optical Character Recognition
trOCR
Handwritten Text Recognition
screenshot-OCR
Desktop application that lets the user extract text from images by just marking a section of the screen, instead of having to load an image file. Serves as a front-end for the Tesseract OCR Engine.
HandWritenSignatureDetection
Deep Learning based Signature Detection (YOLOv5x)
Repo-2020
Machine Learning, Google Cloud and Quantitative Algorithms for Stocks Trading
pisahkan-ktp
Python package for information extraction and segmentation of the Indonesian ID card (KTP).
Proyecto-Deteccion_de_Matriculas
Uses YOLOv10 to detect vehicles on the road, then detects their license plates and uses tesseract-OCR to read them.
Akshara-Jaana
An OCR project for reading new and old Kannada texts.