gpt-4-vision

There are 84 repositories under the gpt-4-vision topic.

  • lobehub/lobe-chat

    🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Vision/TTS/Plugins/Artifacts). One-click FREE deployment of your private ChatGPT/Claude application.

    Language: TypeScript
  • szczyglis-dev/py-gpt

    Desktop AI Assistant powered by o1, GPT-4, GPT-4 Vision, Gemini, Claude, Llama 3, Bielik, DALL-E, Langchain, Llama-index, chat, vision, voice control, image generation and analysis, agents, command execution, file upload/download, speech synthesis and recognition, access to Web, memory, presets, assistants, plugins, and more. Linux, Windows, Mac.

    Language: Python
  • Skythinker616/gpt-assistant-android

    A free ChatGPT API voice assistant for Android, activated via the volume keys for voice interaction, supporting web access, Vision photo recognition, and question templates.

    Language: Java
  • lancedb/vectordb-recipes

    High quality resources & applications for LLMs, multi-modal models and VectorDBs

    Language: Jupyter Notebook
  • SkalskiP/sports

    Cool experiments at the intersection of Computer Vision and Sports ⚽🏃

    Language: Jupyter Notebook
  • TypingMind/typingmind

    The most advanced Web UI for AI chat

    Language: JavaScript
  • WisconsinAIVision/ViP-LLaVA

    [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

    Language: Python
  • developersdigest/ai-devices

    AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more

    Language: TypeScript
  • tbckr/sgpt

    SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.

    Language: Go
  • vdutts7/gpt4V-scraper

    AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.

    Language: JavaScript
  • davidmigloz/pixels2flutter

    Convert a screenshot to a working Flutter app.

    Language: Dart
  • ktutak1337/Stellar-Chat

    A versatile multi-modal chat application that enables users to develop custom agents, create images, leverage visual recognition, and engage in voice interactions. It integrates seamlessly with local LLMs and commercial models like OpenAI, Gemini, Perplexity, and Claude, and lets users converse with uploaded documents and websites.

    Language: C#
  • mountaineerbr/shellChatGPT

    Shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS. Features LocalAI, Ollama, Gemini, Mistral, Groq, Anthropic, Novita AI, and xAI integration.

    Language: Shell
  • sazonovanton/SirChatalot

    SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities or tools/functions.

    Language: Python
  • animalnots/BetterChatGPT-PLUS

    Maintained version of bettergpt. An amazing UI for OpenAI's ChatGPT (Website + Windows + MacOS + Linux). https://discord.gg/2CKfAbAJrH

    Language: TypeScript
  • nateraw/openai-vision-api-for-videos

    Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦

    Language: Jupyter Notebook
  • Anil-matcha/GPT-4-Vision-Chatbot

    GPT-4 Vision Chatbot examples

    Language: Jupyter Notebook
  • 42lux/CaptainCaption

    A Gradio-based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.

    Language: Python
  • signebedi/gptty

    ChatGPT wrapper in your TTY

    Language: Python
  • GianfrancoCorrea/gpt-4-vision-chat

    GPT 4 Turbo Vision with Chainlit

    Language: Python
  • supershaneski/chatgpt-with-image-sample

    This sample project integrates OpenAI's GPT-4 Vision, with advanced image recognition capabilities, and DALL·E 3, the state-of-the-art image generation model, with the Chat completions API. This powerful combination allows for simultaneous image creation and analysis.

    Language: JavaScript
  • Helltar/artific_intellig_bot

    AI Telegram Bot, ChatGPT, Dalle2, Whisper, GPT-4 Vision, Stability AI

    Language: Kotlin
  • LazaUK/AOAI-GPT4Vision-Streamlit-SDKv1

    Using an Azure OpenAI deployment of GPT-4 Turbo with Vision to analyse out-of-stock situations in a fictitious retail shop.

    Language: Python
  • neka-nat/mylangrobot

    Language instructions to mycobot using GPT-4V

    Language: Python
  • waseemhnyc/object-detection-openai

    Object detection using the OpenAI Vision model

    Language: Python
  • jeremy-collins/gpt4v-screenshot-analyzer

    This tool offers an interactive way to analyze and understand your screenshots using OpenAI's GPT-4 Vision API. Capture any part of your screen and engage in a dialogue with ChatGPT to uncover detailed insights, ask follow-up questions, and explore visual data in a user-friendly format.

    Language: Python
  • mapluisch/GPT-4-Vision-for-HoloLens

    Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).

    Language: ShaderLab
  • mickymultani/GPT-4-Vision-Architecture-Scanner

    A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insights and detailed breakdowns in an interactive chat interface.

    Language: JavaScript
  • scalable-dynamics/gpt-spa

    A customizable GPT in a single page, using OpenAI models text-embedding-ada-002, tts-1, whisper-1, dall-e-3, and gpt-4-vision-preview

    Language: JavaScript
  • komzweb/nextjs-gpt4v

    A simple chat app with vision using Next.js, Vercel AI SDK, and GPT-4V.

    Language: TypeScript
  • philfung/awesome-computer-use

    Curated resources about automated GUI computer use via LLMs. Highly opinionated; the focus is on quality over quantity.

  • kornia/pixie

    Pixie: Computer Vision AI Engineer assistant

  • reidbarber/gen-ui

    Use text or image prompts to generate components and apps built with React.

    Language: TypeScript
  • walimorris/opensquare

    OSINT platform that provides image analysis, digital footprints, video transcription, and more. Capable of Retrieval Augmented Generation (RAG).

    Language: Java
  • wfce/ChatGPT-OpenAI-API

    The lowest-priced OpenAI ChatGPT-4-32K and ChatGPT-3.5 APIs on the entire network, up to 42 times cheaper than the official price.

  • jacobmarks/gpt4-vision-plugin

    Chat with your images using GPT-4 Vision!

    Language: Python