/visual-spatial-reasoning

[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).
