gpt-4-vision
There are 84 repositories under the gpt-4-vision topic.
lobehub/lobe-chat
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), knowledge base (file upload / knowledge management / RAG), and multi-modal features (Vision / TTS / Plugins / Artifacts). One-click FREE deployment of your private ChatGPT / Claude application.
szczyglis-dev/py-gpt
Desktop AI Assistant powered by o1, GPT-4, GPT-4 Vision, Gemini, Claude, Llama 3, Bielik, DALL-E, Langchain, Llama-index, chat, vision, voice control, image generation and analysis, agents, command execution, file upload/download, speech synthesis and recognition, access to Web, memory, presets, assistants, plugins, and more. Linux, Windows, Mac.
Skythinker616/gpt-assistant-android
A free ChatGPT API voice assistant for Android, activated via the volume keys for voice interaction, supporting features such as web access, Vision photo recognition, and question templates.
lancedb/vectordb-recipes
High quality resources & applications for LLMs, multi-modal models and VectorDBs
SkalskiP/sports
Cool experiments at the intersection of Computer Vision and Sports ⚽🏃
TypingMind/typingmind
The most advanced Web UI for AI chat
WisconsinAIVision/ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
developersdigest/ai-devices
AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
tbckr/sgpt
SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.
vdutts7/gpt4V-scraper
AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.
davidmigloz/pixels2flutter
Convert a screenshot to a working Flutter app.
ktutak1337/Stellar-Chat
A versatile multi-modal chat application that enables users to develop custom agents, create images, leverage visual recognition, and engage in voice interactions. It integrates seamlessly with local LLMs and commercial models like OpenAI, Gemini, Perplexity, and Claude, and allows users to converse with uploaded documents and websites.
mountaineerbr/shellChatGPT
Shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS. Features LocalAI, Ollama, Gemini, Mistral, Groq, Anthropic, Novita AI, and xAI integration.
sazonovanton/SirChatalot
SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities or tools/functions.
animalnots/BetterChatGPT-PLUS
Maintained version of bettergpt. An amazing UI for OpenAI's ChatGPT (Website + Windows + MacOS + Linux). https://discord.gg/2CKfAbAJrH
nateraw/openai-vision-api-for-videos
Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦
Anil-matcha/GPT-4-Vision-Chatbot
GPT-4 Vision Chatbot examples
42lux/CaptainCaption
A gradio based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.
signebedi/gptty
ChatGPT wrapper in your TTY
GianfrancoCorrea/gpt-4-vision-chat
GPT 4 Turbo Vision with Chainlit
supershaneski/chatgpt-with-image-sample
This sample project integrates OpenAI's GPT-4 Vision, with advanced image recognition capabilities, and DALL·E 3, the state-of-the-art image generation model, with the Chat completions API. This powerful combination allows for simultaneous image creation and analysis.
Helltar/artific_intellig_bot
AI Telegram Bot, ChatGPT, Dalle2, Whisper, GPT-4 Vision, Stability AI
LazaUK/AOAI-GPT4Vision-Streamlit-SDKv1
Using an Azure OpenAI deployment of GPT-4 Turbo with Vision to analyse out-of-stock situations in a fictitious retail shop.
neka-nat/mylangrobot
Language instructions to mycobot using GPT-4V
waseemhnyc/object-detection-openai
Object detection using the OpenAI vision model
jeremy-collins/gpt4v-screenshot-analyzer
This tool offers an interactive way to analyze and understand your screenshots using OpenAI's GPT-4 Vision API. Capture any part of your screen and engage in a dialogue with ChatGPT to uncover detailed insights, ask follow-up questions, and explore visual data in a user-friendly format.
mapluisch/GPT-4-Vision-for-HoloLens
Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).
mickymultani/GPT-4-Vision-Architecture-Scanner
A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insights and detailed breakdowns in an interactive chat interface.
scalable-dynamics/gpt-spa
A customizable GPT in a single page, using OpenAI models text-embedding-ada-002, tts-1, whisper-1, dall-e-3, and gpt-4-vision-preview
komzweb/nextjs-gpt4v
A simple chat app with vision using Next.js, Vercel AI SDK, and GPT-4V.
philfung/awesome-computer-use
Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.
kornia/pixie
Pixie: Computer Vision AI Engineer assistant
reidbarber/gen-ui
Use text or image prompts to generate components and apps built with React.
walimorris/opensquare
OSINT Platform - Provides image analysis, digital footprints, video transcription and more. Retrieval Augmented Generation (RAG) capable platform
wfce/ChatGPT-OpenAI-API
The lowest-priced OpenAI ChatGPT-4-32K and ChatGPT-3.5 APIs on the web, up to 42 times cheaper than the official price.
jacobmarks/gpt4-vision-plugin
Chat with your images using GPT-4 Vision!