/UTA

Enhancing Vision-Language Model with Unmasked Token Alignment (TMLR)

Primary language: Python
License: MIT
