ocr-python

There are 476 repositories under the ocr-python topic.

  • Umi-OCR

    hiroi-sora/Umi-OCR

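    OCR software, free and offline. Open source, supporting screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Multi-language libraries are built in.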

    Language:Python39.6k2038113.9k
  • CnOCR

    breezedeus/CnOCR

    CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. (A Chinese/English OCR Python package based on PyTorch/MXNet.)

    Language:Python3.7k68263533
  • CatchTheTornado/text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

    Language:Python2.9k1381247
  • hiroi-sora/Umi-OCR_v2

    An ending and a new beginning

    Language:QML948126387
  • fast-plate-ocr

    ankandrew/fast-plate-ocr

    Lightweight & fast OCR models for license plate text recognition.

    Language:Python31495745
  • Psarpei/Multi-Type-TD-TSR

    Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition

    Language:Jupyter Notebook28291553
  • maxent-ai/ocrpy

    OCR, Archive, Index and Search: Implementation agnostic OCR framework.

    Language:Jupyter Notebook2232211
  • nathanaday/RealTime-OCR

    Perform text detection in a variety of languages with your computer webcam using Google Tesseract OCR and OpenCV. This script achieves a real-time OCR effect via multi-threading.

    Language:Python1804344
  • MrZilinXiao/Hyper-Table-OCR

    A carefully designed OCR pipeline for universal bordered table recognition and reconstruction.

    Language:C++17721545
  • genieincodebottle/parsemypdf

    Collection of PDF parsing libraries like AI based docling, claude, openai, gemini, meta's llama-vision, unstructured-io, and pdfminer, pymupdf, pdfplumber etc for efficient snapshot, text, table, and metadata extraction.

    Language:Python1523331
  • blueaxis/Cloe

    Manga OCR snipping application for desktop

    Language:Python13632311
  • ilic5000/pabkvizgenerator

    Anansi is a computer vision (cv2 and FFmpeg) + OCR (EasyOCR and tesseract) python-based crawler for finding and extracting questions and correct answers from video files of popular TV game shows in the Balkan region.

    Language:Python125517
  • shibing624/imgocr

    Python3 package for Chinese/English OCR, using the paddleocr-v5 ONNX model (~20MB), with ultra-fast inference speed. Inference is based on the ppocr-v5-onnx model; open-source SOTA for Chinese/English OCR.

    Language:Python1112716
  • nainiayoub/pdf-text-data-extractor

    PDF text data extraction web app with OCR for scanned documents

    Language:Python914450
  • prp-e/persian_ocr_project

    A FLOSS software for Persian Optical Character Recognition

    Language:Jupyter Notebook908111
  • oidlabs-com/Lexoid

    Multimodal document parser for high quality data understanding and extraction

    Language:Python845678
  • kartikgill/Easter2

    Easter2.0: Improving Convolutional Models for Handwritten Text Recognition

    Language:Jupyter Notebook7921824
  • tamil_ocr

    gnana70/tamil_ocr

    OCR Tamil is a tool that detects and recognizes Tamil text in natural-scene images with high accuracy

    Language:Python734713
  • ksasso1028/EasyOCR-cpp

    Custom C++ implementation of deep learning based OCR

    Language:C++6121517
  • bentoml/BentoOCR

    Turn any OCR models into online inference API endpoint 🚀 🌖

    Language:Python56534
  • X-T-E-R/my-little-ocr

    MyLittleOCR is a unified OCR wrapper providing a consistent API for seamless integration and switching between multiple OCR engines.

    Language:Python54203
  • PSPDFKit/nutrient-dws-client-python

    Official Python client library for Nutrient Document Web Services API - PDF processing, OCR, watermarking, and document manipulation with automatic Office format conversion

    Language:Python53016
  • moheladwy/OCR4Linux

    OCR script tool for extracting text from screenshots (images) using only bash and Python scripts

    Language:Shell49166
  • voun7/VidSubX

    A program for extracting hard-coded (burned-in) subtitles from a video and generating an external subtitle file.

    Language:Python463119
  • MauryaRitesh/OCR-Python

    Optical Character Recognition in Python.

    Language:Jupyter Notebook442222
  • sepehrraisi/Persian-OCR

    A project to bring high accuracy OCR to Persian language.

    Language:Shell44226
  • Baskar-forever/TableExtractor-Advanced-PDF-Table-Extraction

    PDF Table Extractor is a Python project designed to tackle the challenge of extracting tables from scanned PDF documents, leveraging advanced optical character recognition (OCR) and image processing techniques.

    Language:Jupyter Notebook39108
  • zefoy-captcha-solver

    xtekky/zefoy-captcha-solver

    Zefoy OCR captcha solver | 99% accurate

    Language:Python35209
  • sergiocorreia/quipucamayoc

    dev repo for article

    Language:Python31575
  • Employee-Monitoring-Using-Object-Detection

    pgplarosa/Employee-Monitoring-Using-Object-Detection

    Deep Learning Individual Project - March 03, 2022.

    Language:HTML30205
  • ASACHIT/OCR-django-app

    A Django web app to scan text from images: fast, easy, and efficient

    Language:CSS292110
  • Unstructured-IO/community

    Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

  • zhiweiiii/fapiao-ocr-excel

    Automatically recognizes invoice contents using OCR and exports them to Excel. (Automatically recognizes images and PDF files.)

    Language:Python26
  • ayseceyda/analog-meter-reading-openCV

    AMR (automatic meter reading) project for analog meters, built with openCV+Python using basic OCR and image processing knowledge.

    Language:Jupyter Notebook251010
  • Jan-9C/deathcounter_ocr

    A Python script that detects death messages using OCR and displays a corresponding death counter. Preconfigured for Elden Ring.

    Language:Python241115
  • yunwoong7/korean_ocr_using_paddleOCR

    Korean OCR Python code using the paddleOCR library

    Language:Jupyter Notebook24112