Official implementation of Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions (NeurIPS DB Track'24 Spotlight).

HA3D Simulator

The HA3D Simulator integrates dynamic 3D human models into real-world Matterport3D environments. Built upon the Matterport3D Simulator API and MDM, it offers a robust platform for immersive 3D simulations.

🔧 Setup Environment

First, set an environment variable pointing to the location where the dataset will be stored. Replace /your/path/to/store/data below with the full absolute path to the directory that will contain the individual Matterport scan directories.

# Add the export line below to ~/.bashrc, then reload it
vim ~/.bashrc
export HA3D_SIMULATOR_DATA_PATH=/your/path/to/store/data
source ~/.bashrc
# Verify the variable is set
echo $HA3D_SIMULATOR_DATA_PATH

Expected directory structure:

/your/path/to/store/data
├── data
└── human_motion_meshes
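
Scripts can then resolve the data root from this variable at runtime. A minimal sanity-check sketch (check_data_path.py is a hypothetical helper, not part of the repository):

    # check_data_path.py -- illustrative helper, not part of the repository
    import os
    import sys

    root = os.environ.get("HA3D_SIMULATOR_DATA_PATH")
    if not root:
        sys.exit("HA3D_SIMULATOR_DATA_PATH is not set; add it to ~/.bashrc first.")

    # The simulator expects these two subdirectories under the data root.
    for sub in ("data", "human_motion_meshes"):
        path = os.path.join(root, sub)
        print(path, "ok" if os.path.isdir(path) else "MISSING")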

🐍 Create Conda Environment

Set up a Conda environment for the simulator.

conda create --name ha3d_simulator python=3.10
conda activate ha3d_simulator
pip install -r requirements.txt

📥 Download Dataset

To use the simulator, download the Matterport3D Dataset (access required).

python2 download_mp.py -o $HA3D_SIMULATOR_DATA_PATH/dataset --type matterport_skybox_images undistorted_camera_parameters undistorted_depth_images
python scripts/unzip_data.py
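
scripts/unzip_data.py performs the extraction. Conceptually, it walks the download directory and unpacks every per-scan archive in place, roughly as in this hypothetical sketch (the archive layout is an assumption; defer to the repository script):

    # Illustrative only -- scripts/unzip_data.py is the authoritative version.
    import os
    import zipfile

    root = os.path.join(os.environ["HA3D_SIMULATOR_DATA_PATH"], "dataset")

    # Walk the download directory and extract every .zip next to where it was found.
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".zip"):
                archive = os.path.join(dirpath, name)
                print(f"extracting {archive}")
                with zipfile.ZipFile(archive) as zf:
                    zf.extractall(dirpath)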

🔄 Dataset Preprocessing

Speed up data loading and reduce memory usage by preprocessing the matterport_skybox_images.

./scripts/downsize_skybox.py
./scripts/depth_to_skybox.py

The downsize_skybox.py script downscales and combines all cube faces into a single image, producing filenames like <PANO_ID>_skybox_small.jpg. Alternatively, the preprocessed dataset can be downloaded directly from Google Drive.
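
For intuition, the preprocessing amounts to loading the six skybox faces of each panorama, shrinking them, and tiling them into one image, roughly as sketched below (the face filename pattern and output size are assumptions; scripts/downsize_skybox.py is the reference implementation):

    # Simplified illustration of the skybox preprocessing idea; see
    # scripts/downsize_skybox.py for the actual implementation.
    import glob
    import os
    from PIL import Image

    FACE_SIZE = 512  # assumed downscaled edge length

    def combine_skybox(pano_dir, pano_id):
        # Matterport skyboxes ship as six cube faces per panorama;
        # the exact filename pattern may differ from this assumption.
        faces = sorted(glob.glob(os.path.join(pano_dir, f"{pano_id}_skybox*.jpg")))[:6]
        tiles = [Image.open(f).resize((FACE_SIZE, FACE_SIZE)) for f in faces]

        # Stitch the six faces side by side into a single strip.
        strip = Image.new("RGB", (FACE_SIZE * len(tiles), FACE_SIZE))
        for i, tile in enumerate(tiles):
            strip.paste(tile, (i * FACE_SIZE, 0))
        strip.save(os.path.join(pano_dir, f"{pano_id}_skybox_small.jpg"))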

🏗️ Build Matterport3D Simulator

Follow the instructions in the Matterport3DSimulator/README.

🚀 Run HA3D Simulator

  1. Start the simulator:

    python driven.py
  2. In a new terminal, start the HA3D Simulator GUI:

    python GUI.py
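
The simulator can also be driven programmatically. Since HA3D builds on the Matterport3D Simulator API, an episode loop might look like the sketch below, assuming HA3D keeps MatterSim-style Python bindings (the module name, batched call signatures, and the scan/viewpoint IDs are assumptions or placeholders; see driven.py for the actual entry points):

    # Sketch of a MatterSim-style episode, assuming HA3D keeps the
    # Matterport3DSimulator Python bindings; IDs below are placeholders.
    import math
    import MatterSim

    sim = MatterSim.Simulator()
    sim.setCameraResolution(640, 480)
    sim.setCameraVFOV(math.radians(60))
    sim.initialize()

    # newEpisode takes batched lists in recent Matterport3DSimulator versions.
    sim.newEpisode(["<SCAN_ID>"], ["<VIEWPOINT_ID>"], [0.0], [0.0])

    state = sim.getState()[0]
    print(state.location.viewpointId, state.heading)

    # Move to the first navigable neighbor, if any, keeping heading and elevation.
    if len(state.navigableLocations) > 1:
        sim.makeAction([1], [0.0], [0.0])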

🕺 Human Motion Generation

Refer to the human_motion_model/README for detailed instructions.

🌆 Annotation

Refer to the human-viewpoint_annotation/README for detailed instructions.

🖥️ Offscreen Rendering

Pyrender supports three backends for offscreen rendering:

  • Pyglet: Requires an active display manager.
  • OSMesa: Software renderer.
  • EGL: GPU-accelerated rendering without a display manager (default).

More details: Pyrender Offscreen Rendering
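
To pick a backend explicitly, set PYOPENGL_PLATFORM before importing pyrender. A minimal EGL example (the placeholder sphere scene is only for illustration):

    import os
    os.environ["PYOPENGL_PLATFORM"] = "egl"  # or "osmesa" for the software renderer

    import numpy as np
    import pyrender
    import trimesh

    # Build a trivial scene: one sphere mesh plus a camera and a light.
    sphere = trimesh.creation.uv_sphere(radius=0.2)
    scene = pyrender.Scene()
    scene.add(pyrender.Mesh.from_trimesh(sphere))

    camera = pyrender.PerspectiveCamera(yfov=np.pi / 3.0)
    camera_pose = np.eye(4)
    camera_pose[2, 3] = 1.0  # pull the camera back along +z
    scene.add(camera, pose=camera_pose)
    scene.add(pyrender.DirectionalLight(color=np.ones(3), intensity=3.0), pose=camera_pose)

    # Render offscreen without a display manager.
    renderer = pyrender.OffscreenRenderer(viewport_width=640, viewport_height=480)
    color, depth = renderer.render(scene)
    renderer.delete()
    print(color.shape, depth.shape)  # (480, 640, 3) and (480, 640)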

📊 Training

Train the model using the following command:

python tasks/DT_miniGPT/train_GPT.py --experiment_id time --cuda 2 --reward_strategy 1 --epochs 15 --fusion_type simple --target_rtg 5 --mode train

Feel free to contribute, report issues, or request features!

Citation

@article{li2024human,
    title={Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions},
    author={Minghan Li and Heng Li and Zhi-Qi Cheng and Yifei Dong and Yuxuan Zhou and Jun-Yan He and Qi Dai and Teruko Mitamura and Alexander G Hauptmann},
    journal={arXiv preprint arXiv:2406.19236},
    year={2024}
}