3D_Visual_Grounding

Overview

Datasets

  1. ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes, Stanford University, ECCV 2020, Oral [Project] [Paper] [Code]

  2. ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language, Technical University of Munich, ECCV 2020 [Project] [Paper] [Code]

Paper Roadmap (Chronological Order):

  1. Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images, Shenzhen Research Institute of Big Data, CUHK-Shenzhen, CVPR 2021 [Project] [Paper] [Code]

  2. Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud, Xidian University, ICCV 2021 [Paper] [Code]

  3. InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring, The Chinese University of Hong Kong (Shenzhen), ICCV 2021 [Paper] [Code]

  4. 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds, College of Software, Beihang University, ICCV 2021 [Paper] [Code]

  5. SAT: 2D Semantics Assisted Training for 3D Visual Grounding, University of Rochester, ICCV 2021, Oral [Paper] [Code]

  6. TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding, School of Computer Science and Engineering, Beihang University, ACM MM 2021 [Paper] [Code]

  7. LanguageRefer: Spatial-Language Model for 3D Visual Grounding, University of Washington, CoRL 2021 [Paper] [Code]

  8. 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection, Institute of Artificial Intelligence, Beihang University, CVPR 2022, Oral [Paper] [Code]

  9. Multi-View Transformer for 3D Visual Grounding, The Chinese University of Hong Kong, CVPR 2022 [Paper] [Code]

  10. Language Conditioned Spatial Relation Reasoning for 3D Object Grounding, Inria, École normale supérieure, CNRS, PSL Research University, NeurIPS 2022 [Paper] [Code]

  11. Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding, King Abdullah University of Science and Technology, NeurIPS 2022 [Paper] [Code]

  12. EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding, Shenzhen Graduate School, Peking University, CVPR 2023 [Paper] [Code]

  13. Language-Assisted 3D Feature Learning for Semantic Scene Understanding, Tsinghua University, AAAI 2023, Oral [Paper] [Code]

  14. NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations, Stanford University, MIT, CVPR 2023 [Paper]

  15. Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding, Zhejiang University, ICCV 2023 [Paper]

  16. ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance, Shanghai Artificial Intelligence Laboratory [Paper]

  17. 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding, Zhejiang University [Paper]

  18. A Unified Framework for 3D Point Cloud Visual Grounding, Xiamen University [Paper] [Code]

Extension

Captioning & Grounding

  1. D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding, Technical University of Munich, ECCV 2022 [Paper] [Code]

  2. 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds, Beihang University, CVPR 2022 [Paper] [Code]