hanmenghan's Stars
OpenBMB/ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
openai/transformer-debugger
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
openai/automated-interpretability
showlab/Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
mpSchrader/gym-sokoban
Sokoban environment for OpenAI Gym
SHI-Labs/VCoder
VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024
pkunlp-icler/FastV
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
hanmenghan/TMC
The project page of the paper "Trusted Multi-View Classification" [ICLR 2021]
SkyworkAI/agent-studio
Benchmarks, environments, and toolkits for general computer agents
showlab/Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
BillChan226/HALC
[ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"
QingyangZhang/QMF
Quality-aware multimodal fusion (ICML 2023)
jiajunsi/RCML
Reliable Conflictive Multi-view Learning
Stanford-AIMI/chexpert-plus
ZhaofengWu/counterfactual-evaluation
hamidkazemi22/CLIPInversion
What do we learn from inverting CLIP models?
QinYang79/Awesome-Noisy-Correspondence
A summary of research on noisy correspondence. There may be omissions; if anything is missing, please get in touch with us. Our emails: linyijie.gm@gmail.com, yangmouxing@gmail.com, qinyang.gm@gmail.com
QingyangZhang/awesome-low-quality-multimodal-learning
XLearning-SCU/2024-ICLR-READ
PyTorch implementation of "Test-Time Adaptation against Multi-modal Reliability Bias".
showlab/assistgui
ycfate/ID-like
ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection
Cocofeat/EyeMoSt
[MICCAI 2023 Early Accept & MedIA submission] EyeMoSt: "Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions"
hanmenghan/Skip-n
This repository contains the code for our paper "Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models".
mit-acl/gym-minigrid
Cocofeat/MedRG
MedRG: Medical Report Grounding with Multi-modal Large Language Model
timqqt/Fair_Text_based_Image_Retrieval
hanmenghan/Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).