/Flipped-VQA

Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)

Primary Language: Python
License: MIT
