Pinned Repositories
sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Sa2VA
🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
LITA
MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
Qwen2.5-VL
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
VideoLISA
[NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos