image-to-text
There are 308 repositories under the image-to-text topic.
thiagoalessio/tesseract-ocr-for-php
A wrapper to work with Tesseract OCR inside PHP.
killkimno/MORT
MORT translator project - Real-time game translator with OCR
lucidrains/CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox. Equipped with high performance and flexibility.
Flame-Code-VLM/Flame-Code-VLM
Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured training workflows to bridge the gap between design and front-end development.
zapolnoch/node-tesseract-ocr
A Node.js wrapper for the Tesseract OCR API
google/imageinwords
Data release for the ImageInWords (IIW) paper.
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
NormXU/nougat-latex-ocr
Codebase for fine-tuning / evaluating nougat-based image2latex generation models
shoryasethia/markdrop
A Python package that converts PDFs to markdown while extracting images and tables, and generates text descriptions for the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.
yardstick17/image_text_reader
This module extracts text from images using the Tesseract OCR engine. Text in images is often blurred or unevenly sized, so the image is pre-processed to make it easier for the OCR engine to read. The module first draws bounding boxes around the text in the image and then normalizes it to 300 dpi, a resolution suitable for the OCR engine.
nateshmbhat/card-scanner-flutter
A Flutter package for fast, accurate, and secure credit card and debit card scanning
BEPb/image_to_ascii
Everything is very simple: you either download a picture file or specify its link when running a Python script, and as output you get a text file. You can immediately preview the result of the conversion on the command line.
mshdabiola/NotePad
Notepad is a multi-module Jetpack Compose note-taking app with a sketch pad, voice recorder, and image capture
NanoNets/ocr-python
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
MIMICLab/L-Verse
L-Verse: Bidirectional Generation Between Image and Text
untrix/im2latex
Solution to OpenAI's im2latex request for research
farhanchoudhary/PAN_Card_OCR_Project
Extracts details in a structured format from Indian national identification documents: PAN (completed); Aadhar, Passport, and Driving License (WIP)
Carleslc/ImageToText
OCR with Google's AI technology (Cloud Vision API)
thanhkeke97/RSTGameTranslation
🎮 Real-time Game Translation Tool | OCR + AI Translation | Windows Gaming | Open Source
glami/glami-1m
The largest multilingual image-text classification dataset. It contains fashion products.
Gen-Verse/HermesFlow
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
bensonruan/Tesseract-OCR
Tesseract.js OCR
aimagelab/safe-clip
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
fny/swiftocr
macOS OCR command-line tool for almost any image format
geoffsmith82/Symposium2023
Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot.
amit-y11/the_ocr_bot
Telegram bot that converts images to text using Python
pharmapsychotic/comfy-cliption
Image to text with CLIP ViT-L/14 in ComfyUI
Zebbeni/ansizalizer
A TUI to convert Images to ANSI strings using bubbletea
zhangming8/Dango-ocr
DangoOCR: screenshot OCR text recognition with support for multiple languages, translation after recognition, and audio playback
DS2BRAIN/ds2
Easiest way to use AI models without coding (Web UI & API support)
Akascape/TEXTEMAGE
A simple image to text converter with GUI!
faizan619/Codo-File
Codo-File is a code editor that primarily supports JavaScript and Python, with partial Dart support. Additionally, it features a real-time website editor where you can create your own website in the browser using HTML, CSS, and JavaScript. The project also includes an image-to-text feature and a voice-to-text feature.
pollinations-ai/pollinations.ai
Work with the best generative AI from Pollinations using this Python SDK. 🐝
torresflo/Tag-Machine
A little Python application to auto tag your photos with the power of machine learning.
MSmaili/note-it
OCR functionality in a feature-rich note-taking extension.