video-question-answering
There are 48 repositories under the video-question-answering topic.
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
jayleicn/ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
Vision-CAIR/MiniGPT4-video
Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
X-PLUG/Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
X-PLUG/mPLUG-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
apple/ml-slowfast-llava
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
salesforce/ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
Yui010206/SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
antoyang/FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
doc-doc/NExT-QA
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
tsujuifu/pytorch_violet
A PyTorch implementation of VIOLET
jayleicn/TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
jpthu17/EMCL
[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
antoyang/just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
jpthu17/HBI
[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
bytedance/Shot2Story
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
mlvlab/Flipped-VQA
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
doc-doc/NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
whwu95/FreeVA
FreeVA: Offline MLLM as Training-Free Video Assistant
bcmi/Causal-VidQA
[CVPR 2022] A large-scale public benchmark dataset for video question answering, focused on evidence and commonsense reasoning. Includes the code used in the paper "From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering".
sail-sg/VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
zchoi/PKOL
[TIP 2022] Official code for the paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”
tsujuifu/pytorch_empirical-mvm
A PyTorch implementation of EmpiricalMVM
XLiu443/Tem-adapter
[ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
doc-doc/HQGA
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)
mlvlab/MELTR
MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023)
noagarcia/knowit-rock
ROCK model for Knowledge-Based VQA in Videos
yl3800/IGV
This repo contains code for Invariant Grounding for Video Question Answering
doc-doc/CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
noagarcia/ROLL-VideoQA
PyTorch code for ROLL, a knowledge-based video story question answering model.
mlvlab/OVQA
Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 2023)
zhousheng97/ViTXT-GQA
✨✨ Scene-Text Grounding for Text-Based Video Question Answering (arXiv)
mmazab/LifeQA
Data and PyTorch code for the LifeQA LREC 2020 paper.
declare-lab/Sealing
[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
chakravarthi589/Video-Question-Answering_Resources
Video Question Answering | Video QA | VQA