yuezih's Stars
zhaoyue-zephyrus/AVION
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
bdaiinstitute/theia
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
zhangyikaii/Proto-CAT
The code repository for "Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation"
zhangyikaii/LAMDA-ZhiJian
ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse
FlagOpen/FlagScale
FlagScale is a large model toolkit based on open-sourced projects.
yuezih/SMILE
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation (NeurIPS 2023)
ylwhxht/SRKD-DRET
AAAI2024 - Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection
yuezih/less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
yuezih/Movie101
Narrative movie understanding benchmark
showlab/Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
X-PLUG/mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
EricLee8/MPD_EMVI
Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference
ccfddl/ccf-deadlines
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~
YixunLiang/ReTR
Official code of ReTR (NeurIPS 2023)
yaolinli/CapEnrich
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
voxel51/fiftyone
Refine high-quality datasets and visual AI models
ML-GSAI/DPT
Official PyTorch implementation for "Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels"
xieyuquanxx/awesome-Large-MultiModal-Hallucination
😎 Up-to-date & curated list of awesome LMM hallucination papers, methods & resources.
eric-ai-lab/awesome-vision-language-navigation
A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
EnVision-Research/LucidDreamer
Official implementation of "LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching"
DarkHighness/opendigger-cli
bcbi-edu/p_eickhoff_isoscore
TideDancer/iclr21_isotropy_contxt
ainagari/monopoly
wtimkey/rogue-dimensions
replication code for EMNLP 2021 paper
diff-usion/Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
wangkai930418/awesome-diffusion-categorized
collection of diffusion model papers categorized by their subareas
THUDM/CogVLM
a state-of-the-art-level open visual language model | multimodal pre-trained model