yuanrr

Ph.D. student focusing on image and video understanding, e.g., visual question answering and video question answering.

Pinned Repositories

LGVA_VideoQA
Language-Guided Visual Aggregation for Video Question Answering
Language:Python4 5 122
Flipped-VQA
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
Language:Python61 6 197
Self-PT
Self-PT: Adaptive Self-Prompt Tuning for Low-Resource Visual Question Answering (ACM MM 2023)
Language:Python60
Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Language:Python2.8k 37 184230
logit-standardization-KD
[CVPR 2024 Highlight] Logit Standardization in Knowledge Distillation
Language:Jupyter Notebook253 5 136
CoMa
Language:Python00
ICMRSS
Knowledge-Driven Analysis and Retrieval on Multimedia.
Language:HTML0 2 00
Self-PT
Self-PT: Adaptive Self-Prompt Tuning for Low-Resource Visual Question Answering (ACM MM' 23)
00
SEMA
SEMA: Semantic Distance Adversarial Learning for Text-to-Image Synthesis (TMM' 23)
Language:Python1 1 00
UCT
UCT: Unbiased Feature Learning with Causal Intervention for Visible-Infrared Person Re-identification (Under review)
00