grid-beep

grid-beep's Stars

heshuting555/DsHmp
[CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
Language:Python76
CompVis/stable-diffusion
A latent text-to-image diffusion model
Language:Jupyter Notebook68k10.1k
kdwonn/SaG
Official repository of the "Shatter and Gather: Learning Referring Image Segmentation with Text Supervision (ICCV'23)"
Language:Python332
mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Language:Python1.2k106