Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Primary language: Python
License: Apache License 2.0 (Apache-2.0)