Toward Interactive Regional Understanding in Vision-Large Language Models (NAACL 2024)
Primary language: Python. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).