video-language-pretraining

There are 7 repositories under the video-language-pretraining topic.

DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python 2.9k 33 158265
bytedance/Shot2Story
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
Language: Python 106 6 176
XLearning-SCU/2024-ICLR-Norton
Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]
Language: Python 97 11 18
bigai-nlco/VideoLLaMB
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
Language: Python 59 3 72
liveseongho/Awesome-Video-Language-Understanding
A survey on video and language understanding.
48 1 02
SCZwangxiao/RTQ-MM2023
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
Language: Python 16 4 53
Maddy12/SSL4VideoSurvey
The official GitHub page for the survey paper "Self-Supervised learning for Videos: A survey"
4 2 00