"Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens”, Ben-Avraham et al., NeurIPS 2022
Primary language: Python. License: Apache-2.0.