[Under review] Assessing and Learning Alignment of Unimodal Vision and Language Models
Primary language: Jupyter Notebook