DingchenYang99's Stars
ollama/ollama
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
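A quick way to try Ollama once a model is pulled is its local REST API. The sketch below assumes `ollama serve` is running on the default port 11434 and that `llama3` has been pulled; the `/api/generate` endpoint is part of Ollama's documented API.

```python
# Minimal sketch: query a local Ollama server through its REST API.
# Assumes `ollama serve` is running and `ollama pull llama3` was done.
import json
import urllib.request

payload = json.dumps({
    "model": "llama3",                      # any locally pulled model tag
    "prompt": "Why is the sky blue?",
    "stream": False,                        # one JSON response instead of a stream
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",  # Ollama's default endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```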
justjavac/awesome-wechat-weapp
A curated collection of WeChat Mini Program development resources :100:
meta-llama/llama3
The official Meta Llama 3 GitHub site
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
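For a sense of what LLaVA does at inference time, here is a hedged sketch using the Hugging Face `transformers` port (the `llava-hf/llava-1.5-7b-hf` checkpoint and `LlavaForConditionalGeneration` class come from `transformers`, not this repo's own CLI; the image path is a placeholder):

```python
# Sketch: LLaVA-1.5 inference via the transformers port of this repo's models.
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

image = Image.open("view.jpg")  # any local RGB image (hypothetical path)
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"  # LLaVA-1.5 template

inputs = processor(images=image, text=prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(out[0], skip_special_tokens=True))
```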
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Advances on Multimodal Large Language Models
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
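The repo's model cards document loading MiniCPM-V through `transformers` with `trust_remote_code`. A sketch follows; the custom `chat` method is defined by the repo's remote code and its signature has changed between releases, so treat the call below as an assumption based on the 2.6 model card.

```python
# Sketch of MiniCPM-V 2.6 inference following its model card; the `chat`
# method comes from the repo's remote code and may differ across releases.
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-V-2_6"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True,
                                  torch_dtype=torch.bfloat16).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("photo.jpg").convert("RGB")  # hypothetical local image
msgs = [{"role": "user", "content": [image, "Describe this image."]}]
print(model.chat(image=None, msgs=msgs, tokenizer=tokenizer))
```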
huggingface/trl
Train transformer language models with reinforcement learning.
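trl's entry point for plain supervised fine-tuning is `SFTTrainer`. A minimal sketch, assuming a recent trl version (constructor arguments have shifted across releases; the dataset and model names below are illustrative defaults from trl's documentation):

```python
# Minimal supervised fine-tuning sketch with trl's SFTTrainer.
# Exact constructor arguments vary across trl versions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # example dataset from trl docs

trainer = SFTTrainer(
    model="Qwen/Qwen2-0.5B",  # any small causal-LM checkpoint
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out", max_steps=100),
)
trainer.train()
```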
ShiArthur03/ShiArthur03
A GitHub profile README repository.
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
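LAVIS's "one-stop" claim mostly refers to `load_model_and_preprocess`, which returns a model plus matching processors in one call. A sketch following the README's captioning example (the model and processor names assume they are available in the installed version):

```python
# Sketch of LAVIS's single-call loading API, per its README captioning example.
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)
raw_image = Image.open("photo.jpg").convert("RGB")  # hypothetical local image
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
print(model.generate({"image": image}))             # -> ["a photo of ..."]
```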
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
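The README exposes the pretrained backbones through torch.hub. A feature-extraction sketch (entrypoint name from the published hub config; the random tensor stands in for a normalized image batch):

```python
# Sketch: extract DINOv2 image features via torch.hub, per the repo README.
import torch

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
img = torch.randn(1, 3, 224, 224)   # stand-in for a normalized RGB batch (H, W multiples of 14)
with torch.no_grad():
    feats = model(img)              # (1, 384) global embedding for ViT-S/14
print(feats.shape)
```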
sahil280114/codealpaca
Code Alpaca: an instruction-following LLaMA model trained on code generation instructions.
tylin/coco-caption
Evaluation code for MS COCO caption generation (BLEU, METEOR, ROUGE-L, CIDEr).
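Its evaluation flow is short enough to sketch (the file paths below are hypothetical; the classes are the ones this codebase and its pip-installable fork `pycocoevalcap` expose):

```python
# Sketch of the standard COCO caption evaluation flow (paths hypothetical).
from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

coco = COCO("annotations/captions_val2014.json")           # ground-truth captions
coco_res = coco.loadRes("results/captions_results.json")   # model predictions
evaluator = COCOEvalCap(coco, coco_res)
evaluator.evaluate()                                       # BLEU, METEOR, ROUGE-L, CIDEr
for metric, score in evaluator.eval.items():
    print(f"{metric}: {score:.3f}")
```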
Xwin-LM/Xwin-LM
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
git-cloner/aliendao
A mirror site for downloading Hugging Face models and datasets.
voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
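The core idea is easy to illustrate: apply the LM head to an early layer's hidden state as well as the final one, then contrast the two next-token distributions. A simplified fixed-layer sketch on GPT-2 follows; the paper additionally selects the premature layer dynamically, so this is not the authors' exact implementation.

```python
# Simplified illustration of DoLa: contrast the final ("mature") layer's
# next-token distribution with an early ("premature") layer's.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)

ln_f = model.transformer.ln_f                         # final layer norm ("logit lens")
mature = model.lm_head(out.hidden_states[-1][:, -1])  # already layer-normed
premature = model.lm_head(ln_f(out.hidden_states[6][:, -1]))

contrast = torch.log_softmax(mature, -1) - torch.log_softmax(premature, -1)
# Restrict to tokens the mature layer already finds plausible.
probs = torch.softmax(mature, -1)
contrast[probs < 0.1 * probs.max()] = -float("inf")
print(tok.decode(contrast.argmax(-1)))
```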
shikiw/OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
tsb0601/MMVP
[CVPR 2024] Official repository for "Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs" (the MMVP benchmark).
OpenDriveLab/ViDAR
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
RLHF-V/RLAIF-V
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
DAMO-NLP-SG/VCD
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
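VCD's formulation is simple enough to sketch: contrast next-token logits conditioned on the original image against logits conditioned on a distorted copy, keeping only tokens plausible under the original. `vlm_logits` below is a hypothetical stand-in for one forward pass of an arbitrary LVLM, and the Gaussian noise approximates the paper's diffusion-based distortion.

```python
# Sketch of Visual Contrastive Decoding; `vlm_logits` is a hypothetical
# helper standing in for a forward pass of an arbitrary LVLM.
import torch

def vcd_next_token_logits(vlm_logits, image, prompt_ids, alpha=1.0, noise_std=0.3):
    noisy = image + noise_std * torch.randn_like(image)  # distorted copy of the image
    l_orig = vlm_logits(image, prompt_ids)               # logits with the real image
    l_noisy = vlm_logits(noisy, prompt_ids)              # logits with the distorted image
    contrasted = (1 + alpha) * l_orig - alpha * l_noisy  # VCD's contrastive objective
    # Adaptive plausibility constraint: drop tokens implausible under the original.
    probs = torch.softmax(l_orig, dim=-1)
    cutoff = 0.1 * probs.max(dim=-1, keepdim=True).values
    return contrasted.masked_fill(probs < cutoff, float("-inf"))
```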
ispc-lab/LiDAR4D
💫 [CVPR 2024] LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis
MBZUAI-LLM/web2code
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Beckschen/LLaVolta
[NeurIPS 2024] Efficient Multi-modal Models via Stage-wise Visual Context Compression
Jiaxuan-Li/EVCap
[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
bcdnlp/FAITHSCORE
Code for FAITHSCORE, a fine-grained metric for evaluating the faithfulness of free-form answers from large vision-language models.
likaixin2000/MMCode
[EMNLP 2024] MMCode: a benchmark of multimodal reasoning problems solved via code generation.
DingchenYang99/Pensieve
The official repo of our work "Pensieve: Retrospect-then-Compare mitigates Visual Hallucination"
HQHBench/HQHBench
The official GitHub repository for "Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models"