- ScanQA: 3D Question Answering for Spatial Scene Understanding
- SQA3D: Situated Question Answering in 3D Scenes
- Generating Context-Aware Natural Answers for Questions in 3D Scenes
- Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes
- CLIP-Guided Vision-Language Pre-training for Question Answering in 3D Scenes