gpt-4v

There are 23 repositories under the gpt-4v topic.

OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o-level performance.
Language: Python 6.6k 56 690513
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
Language: Python 1.6k 11 254221
ShareGPT4Omni/ShareGPT4Video
[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Language: Python 1.3k 32 3944
tianyi-lab/HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Language: Python 263 4 117
RLHF-V/RLAIF-V
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
Language: Python 262 6 3610
davideuler/awesome-assistant-api
Try OpenAI Assistant API apps on Google Colab for free. Awesome Assistant API demos!
Language: Jupyter Notebook 210 6 123
ShareGPT4Omni/ShareGPT4V
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Language: Python 179 3 174
yachty66/gpt_pdf_md
🚀 gpt_pdf_md: Convert PDF to Markdown with GPT-4V & more. Extract images, upload to Google Cloud, & generate Markdown with images. Python, GPT-4V Vision, Scala. Ideal for developers, researchers. PDF to Markdown, GPT-4V, image extraction, Python package
Language: Scala 77 4 02
lofcz/LlmTornado
One .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosed APIs.
Language: C# 41 6 1514
jameszhou-gl/gpt-4v-distribution-shift
Code for the ICLR'24 ME-FoMo workshop paper "How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation"
Language: Jupyter Notebook 33 2 32
roboflow/gpt-checkup
Monitor the performance of OpenAI's GPT-4V model over time.
Language: HTML 31 6 15
taogoddd/GPT-4V-API
Self-hosted GPT-4V API
Language: JavaScript 29 1 10
autodistill/autodistill-gpt-4v
GPT-4V(ision) module for use with Autodistill.
Language: Python 26 5 36
logicalroot/gpt-4v-demos
🤖 GPT-4V Demos • Test the model's vision capabilities in your browser using Streamlit • Easy setup
Language: Python 17 2 04
android-com-pl/wp-ai-alt-generator
WordPress plugin that leverages OpenAI's Vision API to automatically generate descriptive alt text for images, enhancing accessibility and SEO.
Language: TypeScript 12 0 34
ShareGPT4Omni/ShareGPT4Omni
ShareGPT4Omni: Towards Building Omni Large Multi-modal Models with Comprehensive Multi-modal Annotations
8 2 00
afonso07/ruskin
Your own personal Ruskin.
Language: TypeScript 6 1 01
aymenfurter/copilot-insurance-claim-demo
How a Picture of Car Damage Can File Your Insurance Claim
Language: Java 6 2 02
aymenfurter/azure-chat-with-your-photos-demo
Chatbot that comprehends uploaded images and engages in detailed conversations about their content.
Language: Bicep 3 2 00
gutbash/lmm-graph-vision
How well do the GPT-4V, Gemini Pro Vision, and Claude 3 Opus models perform zero-shot vision tasks on data structures?
Language: Python 3 3 71
metatatt/003-wireDiagramReader
Wiring Diagram Reader: Use GPT-4V to interpret electrical diagrams. Simplifying complex schematics for seamless high-level understanding.
Language: TypeScript 2 1 00
ndurner/oai_chat
Multi-modal Chatbot based on OpenAI
Language: Python 2 3 00
zaidmukaddam/nmap-vision
NMAP Scan Analysis powered by GPT-4V and GPT-4 Turbo!
Language: TypeScript 1 1 01
