/MSGT

[TMM'23] Multi-modal Structure-embedding Graph Transformer for Visual Commonsense Reasoning

Primary Language: Python | License: MIT
