[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
Primary language: Python. License: Apache License 2.0 (Apache-2.0).