PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
Primary language: Python. License: MIT.