image-to-text

There are 308 repositories under the image-to-text topic.

thiagoalessio/tesseract-ocr-for-php
A wrapper to work with Tesseract OCR inside PHP.
Language: PHP 3k 118 145551
killkimno/MORT
MORT translator project - Real-time game translator with OCR
Language: C# 1.2k 14 8067
lucidrains/CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch
Language: Python 1.2k 13 1889
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
Language: Python 697 24 198218
Flame-Code-VLM/Flame-Code-VLM
Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured training workflows to bridge the gap between design and front-end development.
Language: Python 540 7 841
zapolnoch/node-tesseract-ocr
A Node.js wrapper for the Tesseract OCR API
Language: JavaScript 313 3 2539
google/imageinwords
Data release for the ImageInWords (IIW) paper.
Language: JavaScript 220 9 49
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Language: Python 173 2 1612
NormXU/nougat-latex-ocr
Codebase for fine-tuning / evaluating nougat-based image2latex generation models
Language: Python 157 1 919
shoryasethia/markdrop
A Python package that converts PDFs to Markdown while extracting images and tables, and generates text descriptions of the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.
Language: Python 151 1 63
yardstick17/image_text_reader
The module extracts text from images using the Tesseract OCR engine. Text in images is often blurred or of uneven size, so the image is pre-processed to make it easier for the OCR engine to read. The module first draws bounding boxes around the text in the image and then normalizes it to 300 dpi, a resolution suitable for the OCR engine.
Language: Python 147 12 1043
nateshmbhat/card-scanner-flutter
A Flutter package for fast, accurate, and secure credit and debit card scanning
Language: Swift 126 9 56114
BEPb/image_to_ascii
It is very simple: you either supply a downloaded picture file or specify its link when running the Python script, and you get a text file as output. You can also preview the result of the conversion immediately on the command line.
Language: Python 114 4 011
mshdabiola/NotePad
Notepad is a multi-module Jetpack Compose note-taking app with a sketch pad, voice recorder, and image capture
Language: Kotlin 114 4 149
NanoNets/ocr-python
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
Language: Jupyter Notebook 113 3 614
MIMICLab/L-Verse
L-Verse: Bidirectional Generation Between Image and Text
Language: Python 109 10 26
untrix/im2latex
Solution to OpenAI's im2latex request for research
Language: Jupyter Notebook 90 4 1521
farhanchoudhary/PAN_Card_OCR_Project
Extracts details in a structured format from Indian national identification cards: PAN (completed); Aadhar, passport, and driving license (WIP)
Language: Python 81 5 265
Carleslc/ImageToText
OCR with Google's AI technology (Cloud Vision API)
Language: Python 76 3 116
thanhkeke97/RSTGameTranslation
🎮 Real-time Game Translation Tool | OCR + AI Translation | Windows Gaming | Open Source
Language: C# 74
glami/glami-1m
The largest multilingual image-text classification dataset. It contains fashion products.
Language: Jupyter Notebook 73 6 27
Gen-Verse/HermesFlow
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Language: Python 64 2 13
bensonruan/Tesseract-OCR
Tesseract.js OCR
Language: HTML 62 3 132
aimagelab/safe-clip
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
Language: Python 61 7 40
fny/swiftocr
macOS OCR command-line tool for almost any image format
Language: Python 60 3
geoffsmith82/Symposium2023
Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot.
Language: Pascal 59 9 1112
amit-y11/the_ocr_bot
Telegram bot that converts images to text using Python
Language: Python 58 1 245
pharmapsychotic/comfy-cliption
Image to text with CLIP ViT-L/14 in ComfyUI
Language: Python 58 2 42
Zebbeni/ansizalizer
A TUI to convert Images to ANSI strings using bubbletea
Language: Go 54 1 24
zhangming8/Dango-ocr
DangoOCR: screenshot OCR text recognition; supports multiple languages, translates the recognized text, and plays it as audio
Language: Python 53 2 38
DS2BRAIN/ds2
Easiest way to use AI models without coding (Web UI & API support)
Language: Python 50 7 10132
Akascape/TEXTEMAGE
A simple image to text converter with GUI!
Language: Python 43 2 36
faizan619/Codo-File
Codo-File is a code editor that primarily supports JavaScript and Python, with partial Dart support. It also features a real-time website editor where you can create your own website in the browser using HTML, CSS, and JavaScript. The project also includes image-to-text and voice-to-text features.
Language: JavaScript 43 1 15
pollinations-ai/pollinations.ai
Work with the best generative AI from Pollinations using this Python SDK. 🐝
Language: Python 43 3 34
torresflo/Tag-Machine
A little Python application to auto tag your photos with the power of machine learning.
Language: Python 42 3 47
MSmaili/note-it
OCR functionality in a feature-rich note-taking extension.
Language: TypeScript 41
