image-to-text

There are 308 repositories under the image-to-text topic.

  • thiagoalessio/tesseract-ocr-for-php

    A wrapper to work with Tesseract OCR inside PHP.

    Language:PHP3k118145551
  • killkimno/MORT

    MORT translator project - Real-time game translator with OCR

    Language:C#1.2k148067
  • lucidrains/CoCa-pytorch

    Implementation of CoCa (Contrastive Captioners are Image-Text Foundation Models) in PyTorch

    Language:Python1.2k131889
  • PaddlePaddle/PaddleMIX

    Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

    Language:Python69724198218
  • Flame-Code-VLM/Flame-Code-VLM

    Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured training workflows to bridge the gap between design and front-end development.

    Language:Python5407841
  • zapolnoch/node-tesseract-ocr

    A Node.js wrapper for the Tesseract OCR API

    Language:JavaScript31332539
  • google/imageinwords

    Data release for the ImageInWords (IIW) paper.

    Language:JavaScript220949
  • Yushi-Hu/tifa

    TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

    Language:Python17321612
  • NormXU/nougat-latex-ocr

    Codebase for fine-tuning / evaluating nougat-based image2latex generation models

    Language:Python1571919
  • shoryasethia/markdrop

    A Python package for converting PDFs to Markdown while extracting images and tables, and for generating text descriptions of the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.

    Language:Python151163
  • yardstick17/image_text_reader

    The module extracts text from images using the Tesseract OCR engine. Text in images is often blurry or of uneven size, so the image is pre-processed to make it easier for the OCR engine to read. The module first draws bounding boxes around the text in the image and then normalizes it to 300 dpi, a resolution suitable for the OCR engine.

    Language:Python147121043
  • nateshmbhat/card-scanner-flutter

    A Flutter package for fast, accurate, and secure credit card and debit card scanning

    Language:Swift126956114
  • BEPb/image_to_ascii

    Simple to use: provide a picture file or a link to one when running the Python script, and you get a text file as output; you can immediately preview the result of the conversion on the command line.

    Language:Python1144011
  • mshdabiola/NotePad

    Notepad is a multi-module Jetpack Compose note-taking app with a sketch pad, voice recorder, and image capture

    Language:Kotlin1144149
  • NanoNets/ocr-python

    OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.

    Language:Jupyter Notebook1133614
  • MIMICLab/L-Verse

    L-Verse: Bidirectional Generation Between Image and Text

    Language:Python1091026
  • untrix/im2latex

    Solution to OpenAI's im2latex request for research

    Language:Jupyter Notebook9041521
  • farhanchoudhary/PAN_Card_OCR_Project

    Extracts details from Indian national identification cards such as PAN (completed) and Aadhar, passport, and driving license (WIP) in a structured format

    Language:Python815265
  • Carleslc/ImageToText

    OCR with Google's AI technology (Cloud Vision API)

    Language:Python763116
  • thanhkeke97/RSTGameTranslation

    🎮 Real-time Game Translation Tool | OCR + AI Translation | Windows Gaming | Open Source

    Language:C#74
  • glami/glami-1m

    The largest multilingual image-text classification dataset. It contains fashion products.

    Language:Jupyter Notebook73627
  • Gen-Verse/HermesFlow

    HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

    Language:Python64213
  • bensonruan/Tesseract-OCR

    Tesseract.js OCR

    Language:HTML623132
  • aimagelab/safe-clip

    Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024

    Language:Python61740
  • fny/swiftocr

    macOS OCR command-line tool for almost any image format

    Language:Python603
  • geoffsmith82/Symposium2023

    Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot.

    Language:Pascal5991112
  • amit-y11/the_ocr_bot

    Telegram bot that converts images to text using Python

    Language:Python581245
  • pharmapsychotic/comfy-cliption

    Image to text with CLIP ViT-L/14 in ComfyUI

    Language:Python58242
  • Zebbeni/ansizalizer

    A TUI to convert Images to ANSI strings using bubbletea

    Language:Go54124
  • zhangming8/Dango-ocr

    DangoOCR: screenshot OCR text recognition; supports multiple languages, translation after recognition, and audio playback

    Language:Python53238
  • DS2BRAIN/ds2

    Easiest way to use AI models without coding (Web UI & API support)

    Language:Python50710132
  • Akascape/TEXTEMAGE

    A simple image to text converter with GUI!

    Language:Python43236
  • faizan619/Codo-File

    Codo-File is a code editor that primarily supports JavaScript and Python, with partial Dart support. It also features a real-time website editor where you can create your own website in the browser using HTML, CSS, and JavaScript. The project also includes an image-to-text feature and a voice-to-text feature.

    Language:JavaScript43115
  • pollinations-ai/pollinations.ai

    Work with the best generative AI from Pollinations using this Python SDK. 🐝

    Language:Python43334
  • torresflo/Tag-Machine

    A small Python application that auto-tags your photos using machine learning.

    Language:Python42347
  • MSmaili/note-it

    OCR functionality in a feature-rich note-taking extension.

    Language:TypeScript41