gpt-4v
There are 23 repositories under the gpt-4v topic.
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
ShareGPT4Omni/ShareGPT4Video
[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
tianyi-lab/HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
RLHF-V/RLAIF-V
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
davideuler/awesome-assistant-api
Try OpenAI Assistant API apps on Google Colab for free. Awesome Assistant API demos!
ShareGPT4Omni/ShareGPT4V
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
yachty66/gpt_pdf_md
🚀 gpt_pdf_md: Convert PDF to Markdown with GPT-4V & more. Extract images, upload to Google Cloud, & generate Markdown with images. Python, GPT-4V Vision, Scala. Ideal for developers, researchers. PDF to Markdown, GPT-4V, image extraction, Python package
lofcz/LlmTornado
One .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosed APIs.
jameszhou-gl/gpt-4v-distribution-shift
Code for the ICLR'24 ME-FoMo workshop paper "How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation"
roboflow/gpt-checkup
Monitor the performance of OpenAI's GPT-4V model over time.
taogoddd/GPT-4V-API
Self-hosted GPT-4V API
autodistill/autodistill-gpt-4v
GPT-4V(ision) module for use with Autodistill.
logicalroot/gpt-4v-demos
🤖 GPT-4V Demos • Test the model's vision capabilities in your browser using Streamlit • Easy setup
android-com-pl/wp-ai-alt-generator
WordPress plugin that leverages OpenAI's Vision API to automatically generate descriptive alt text for images, enhancing accessibility and SEO.
ShareGPT4Omni/ShareGPT4Omni
ShareGPT4Omni: Towards Building Omni Large Multi-modal Models with Comprehensive Multi-modal Annotations
afonso07/ruskin
Your own personal Ruskin.
aymenfurter/copilot-insurance-claim-demo
How a Picture of Car Damage Can File Your Insurance Claim
aymenfurter/azure-chat-with-your-photos-demo
Chatbot that comprehends uploaded images and engages in detailed conversations about their content.
gutbash/lmm-graph-vision
How well do the GPT-4V, Gemini Pro Vision, and Claude 3 Opus models perform zero-shot vision tasks on data structures?
metatatt/003-wireDiagramReader
Wiring Diagram Reader: Use GPT-4V to interpret electrical diagrams. Simplifying complex schematics for seamless high-level understanding.
ndurner/oai_chat
Multi-modal Chatbot based on OpenAI
zaidmukaddam/nmap-vision
NMAP Scan Analysis powered by GPT-4V and GPT-4 Turbo!