The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"
Primary Language: Python