Pinned Repositories
Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
llama
Inference code for Llama models
stable-diffusion-videos
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Awesome-MLLM-Hallucination-and-Alignment
Recent works about (M)LLM hallucination and alignment.
OliverLeeXZ.github.io
Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models