/AL-Ref-SAM2

AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation

Primary Language: Python
License: MIT
