large-vision-language-model
There are 15 repositories under the large-vision-language-model topic.
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
InternLM/InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
PKU-YuanGroup/MoE-LLaVA
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
yaotingwangofficial/Awesome-MCoT
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
jqtangust/hawk
🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies
MMStar-Benchmark/MMStar
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
yu-rp/apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
richard-peng-xia/CARES
[NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
SuperBruceJia/Awesome-Large-Vision-Language-Model
Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Models
Ruiyang-061X/VL-Uncertainty
🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
ADL-X/LLAVIDAL
This is the official repository of LLAVIDAL
ai4ce/LUWA
[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archaeologists.
lucaswychan/quant-lvlm
Easy-to-use large vision-language model pipeline for quantitative analysis
amazon-science/THRONE
Code release for THRONE, a CVPR 2024 paper on measuring object hallucinations in LVLM-generated text.