/Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Primary Language: Python
License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause)

Issues