/GVL

Official implementation of the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"

Primary language: Python | License: MIT