GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)
Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).