gpt-4-vision

There are 84 repositories under the gpt-4-vision topic.

lobehub/lobe-chat
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), a knowledge base (file upload / knowledge management / RAG), and multi-modal features (Vision / TTS / Plugins / Artifacts). One-click FREE deployment of your private ChatGPT/Claude application.
Language: TypeScript 49.5k 227 2.5k10.8k
szczyglis-dev/py-gpt
Desktop AI Assistant powered by o1, GPT-4, GPT-4 Vision, Gemini, Claude, Llama 3, Bielik, DALL-E, Langchain, Llama-index, chat, vision, voice control, image generation and analysis, agents, command execution, file upload/download, speech synthesis and recognition, access to Web, memory, presets, assistants, plugins, and more. Linux, Windows, Mac.
Language: Python 724 28 83147
Skythinker616/gpt-assistant-android
A free ChatGPT API voice assistant for Android, activated via the volume keys for voice interaction, with support for internet access, Vision photo recognition, and question templates.
Language: Java 690 10 5197
lancedb/vectordb-recipes
High quality resources & applications for LLMs, multi-modal models and VectorDBs
Language: Jupyter Notebook 655 9 20116
SkalskiP/sports
Cool experiments at the intersection of Computer Vision and Sports ⚽🏃
Language: Jupyter Notebook 484 12 333
TypingMind/typingmind
The most advanced Web UI for AI chat
Language: JavaScript 454 16 0184
WisconsinAIVision/ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Language: Python 303 6 3121
developersdigest/ai-devices
AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
Language: TypeScript 283 3 040
tbckr/sgpt
SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.
Language: Go 283 5 5129
vdutts7/gpt4V-scraper
AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.
Language: JavaScript 265 4 125
davidmigloz/pixels2flutter
Convert a screenshot to a working Flutter app.
Language: Dart 183 7 1139
ktutak1337/Stellar-Chat
A versatile multi-modal chat application that enables users to develop custom agents, create images, leverage visual recognition, and engage in voice interactions. It integrates with local LLMs and commercial models such as OpenAI, Gemini, Perplexity, and Claude, and lets users converse with uploaded documents and websites.
Language: C# 117 3 18
mountaineerbr/shellChatGPT
Shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS. Features LocalAI, Ollama, Gemini, Mistral, Groq, Anthropic, Novita AI, and xAI integration.
Language: Shell 71 0 15
sazonovanton/SirChatalot
SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities or tools/functions.
Language: Python 70 5 812
animalnots/BetterChatGPT-PLUS
Maintained version of bettergpt. An amazing UI for OpenAI's ChatGPT (Website + Windows + MacOS + Linux). https://discord.gg/2CKfAbAJrH
Language: TypeScript 66 4 3031
nateraw/openai-vision-api-for-videos
Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦
Language: Jupyter Notebook 61 3 19
Anil-matcha/GPT-4-Vision-Chatbot
GPT-4 Vision Chatbot examples
Language: Jupyter Notebook 58 3 115
42lux/CaptainCaption
A gradio based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.
Language: Python 54 3 38
signebedi/gptty
ChatGPT wrapper in your TTY
Language: Python 50 3 707
GianfrancoCorrea/gpt-4-vision-chat
GPT 4 Turbo Vision with Chainlit
Language: Python 31 2 45
supershaneski/chatgpt-with-image-sample
This sample project integrates OpenAI's GPT-4 Vision, with advanced image recognition capabilities, and DALL·E 3, the state-of-the-art image generation model, with the Chat completions API. This powerful combination allows for simultaneous image creation and analysis.
Language: JavaScript 24 2 111
Helltar/artific_intellig_bot
AI Telegram Bot, ChatGPT, Dalle2, Whisper, GPT-4 Vision, Stability AI
Language: Kotlin 22 4 34
LazaUK/AOAI-GPT4Vision-Streamlit-SDKv1
Using an Azure OpenAI deployment of GPT-4 Turbo with Vision to analyse out-of-stock situations in a fictitious retail shop.
Language: Python 22 3 06
neka-nat/mylangrobot
Language instructions to mycobot using GPT-4V
Language: Python 18 2 00
waseemhnyc/object-detection-openai
Object detection using the OpenAI vision model
Language: Python 18 2 03
jeremy-collins/gpt4v-screenshot-analyzer
This tool offers an interactive way to analyze and understand your screenshots using OpenAI's GPT-4 Vision API. Capture any part of your screen and engage in a dialogue with ChatGPT to uncover detailed insights, ask follow-up questions, and explore visual data in a user-friendly format.
Language: Python 16 2 14
mapluisch/GPT-4-Vision-for-HoloLens
Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).
Language: ShaderLab 14 2 02
mickymultani/GPT-4-Vision-Architecture-Scanner
A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insights and detailed breakdowns in an interactive chat interface.
Language: JavaScript 14 3 22
scalable-dynamics/gpt-spa
A customizable GPT in a single page, using OpenAI models text-embedding-ada-002, tts-1, whisper-1, dall-e-3, and gpt-4-vision-preview
Language: JavaScript 14 1 13
komzweb/nextjs-gpt4v
A simple chat app with vision using Next.js, Vercel AI SDK, and GPT-4V.
Language: TypeScript 13 2 07
philfung/awesome-computer-use
Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.
13 1 02
kornia/pixie
Pixie: Computer Vision AI Engineer assistant
12 6 12
reidbarber/gen-ui
Use text or image prompts to generate components and apps built with React.
Language: TypeScript 11 3 113
walimorris/opensquare
OSINT Platform - Provides image analysis, digital footprints, video transcription and more. Retrieval Augmented Generation (RAG) capable platform
Language: Java 11 0 01
wfce/ChatGPT-OpenAI-API
The lowest-priced OpenAI ChatGPT-4-32K and ChatGPT-3.5 APIs on the entire network, up to 42 times cheaper than the official price.
10 1 71
jacobmarks/gpt4-vision-plugin
Chat with your images using GPT-4 Vision!
Language: Python 9 3 03
