Code for the paper "Textual supervision for visually grounded spoken language understanding".
Primary language: Python
License: Apache License 2.0 (Apache-2.0)