/SViT

"Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens", Ben-Avraham et al., NeurIPS 2022

Primary language: Python
License: Apache License 2.0 (Apache-2.0)
