(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Primary Language: Python
License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause)