gpt4v

There are 37 repositories under the gpt4v topic.

mnotgod96/AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Language: Python 5.3k 69 86575
X-PLUG/MobileAgent
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Language: Python 3.2k 54 68302
reworkd/tarsier
Vision utilities for web interaction agents 👀
Language: Jupyter Notebook 1.5k 11 1992
AmberSahdev/Open-Interface
Control Any Computer Using LLMs
Language: Python 861 12 1869
bdekraker/WebcamGPT-Vision
Lightweight GPT-4 Vision processing over the Webcam
Language: JavaScript 274 3 249
langgptai/Awesome-Multimodal-Prompts
Prompts for GPT-4V and DALL-E 3 that make full use of their multimodal abilities. GPT4V Prompts, DALL-E3 Prompts.
228 2 016
pAIrprogio/vscode-ui-sketcher
Draw your projects to life
Language: TypeScript 195 2 413
ShareGPT4Omni/ShareGPT4V
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Language: Python 178 3 174
soulteary/amazing-openai-api
Convert different model APIs into the OpenAI API format out of the box.
Language: Go 146 4 811
zzxslp/MM-Navigator
GPT-4V in Wonderland: LMMs as Smartphone Agents
Language: Python 129 15 52
kyegomez/MambaByte
Implementation of MambaByte from "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
Language: Python 111 4 26
cameronking4/sketch2app
The ultimate sketch to code app made using GPT4o serving 25k+ users. Choose your desired framework (React, Next, React Native, Flutter) for your app. It will instantly generate code and preview (sandbox) from a simple hand drawn sketch on paper captured from webcam
76 3 537
admineral/GPT4-Vision-React-Starter
Early Alpha Release: Chat with Your Image - Leveraging GPT-4 Vision and Function Calls for AI-Powered Image Analysis and Description
Language: TypeScript 75 2 241
BUAADreamer/Chinese-LLaVA-Med
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
Language: Python 61 1 74
icebergov/gpt4v-video-voiceover
Video Voiceover with gpt-4o-mini
Language: Jupyter Notebook 33
roboflow/gpt-checkup
Monitor the performance of OpenAI's GPT-4V model over time.
Language: HTML 31 6 15
Azure-Samples/rag-as-a-service-with-vision
This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents, leveraging Azure AI and OpenAI services. It includes ingestion and enrichment flows, a RAG with Vision pipeline, and evaluation tools.
Language: Python 20 13 35
neka-nat/mylangrobot
Language instructions to mycobot using GPT-4V
Language: Python 18 2 00
reidbarber/webmarker
Mark web pages for use with vision-language models
Language: TypeScript 18 1 52
kyegomez/HRTX
Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2
Language: Python 17 4 03
logicalroot/gpt-4v-demos
🤖 GPT-4V Demos • Test the model's vision capabilities in your browser using Streamlit • Easy setup
Language: Python 17 2 04
Charmve/gpt-eyes
I GAVE GPT-4 EYES!
Language: JavaScript 14 3 04
GraphPKU/CoI
Chain of Images for Intuitively Reasoning
Language: Python 8 4 11
easonlai/webcam_chat_with_aoai_gpt4o
Discover the GPT-4o multimodal model at Microsoft Build 2024, now with text and image capabilities. My prototype enhances chats with real-time camera snapshots, powered by Flask, OpenCV, and Azure’s OpenAI Services. It’s interactive, visual, and simple to use. Give it a try!
Language: HTML 6 2 02
elizabethsiegle/stephensmithify-openaivision-sendgrid
Analyze a Video and generate commentary about it with OpenAI's GPT-4V, Text-to-speech, LangChain, Streamlit, Replit, Twilio SendGrid, and OpenCV!
Language: Python 5 4 01
danomation/Discord-Vision-Bot
Proof-of-concept GPT-4 Vision bot
Language: Python 4 1 00
dceluis/vacocam_render
Vision-Assisted Camera Orientation
Language: Jupyter Notebook 4 1 00
Envedity/DAIA
Digital Artificial Intelligence Agent
Language: Python 3 0 00
yunwoong7/GPT-4V-Examples
Explore the power of GPT-4V with our curated examples and tutorials. This repository offers code snippets, step-by-step guides, and use case demonstrations for integrating GPT-4V into various applications. Perfect for both AI novices and experts!
Language: Jupyter Notebook 3 1 00
gpt4api9/gpt4api9
Sparrow GPTs-API marketplace
2 1 00
ethan-yz-hao/equation-ocr-app
OCR application for converting handwritten equations into LaTeX code using OpenAI's GPT-4V API, with a LaTeX renderer for editing and checking (Next.js, TypeScript, OpenAI GPT-4V, KaTeX, Vercel)
Language: TypeScript 1 1 10
jamesponddotco/allalt
[READ-ONLY] Describe images and generate alt tags for visually impaired users.
Language: Go 1 1 00
Ravi-Teja-konda/TunedLlavaDelights
Explore the rich flavors of Indian desserts with TunedLlavaDelights. Utilizing Llava fine-tuning, our project unveils detailed nutritional profiles, taste notes, and optimal consumption times for beloved sweets. Dive into a fusion of AI innovation and culinary tradition.
Language: Python 1 2 00
sagentic-ai/cupid
Valentine's Day Cupid Agent
Language: TypeScript 1 0 02
yunwoong7/VisionQuery-GPT-4v
VisionQuery GPT-4v is a cutting-edge tool that combines screenshot-based queries with OpenAI's GPT-4. It enables users to capture screens, ask questions, and receive insightful answers from GPT-4v, revolutionizing digital interaction and understanding.
Language: Jupyter Notebook 1 1 0
metatatt/iso_bot
ISO 13485 Sniffer Bot, GPT4V with LlamaIndex embedded in a React Bot UI
Language: TypeScript 0 1 00
