/pvp

Official release for the code used in paper: Learning from Active Human Involvement through Proxy Value Propagation (NeurIPS 2023 Spotlight)

Primary LanguagePython

NeurIPS 2023 Spotlight

Official release for the code used in paper: Learning from Active Human Involvement through Proxy Value Propagation

Webpage | Code | Poster | Paper

Installation

# Clone the code to local machine
git clone https://github.com/metadriverse/pvp
cd pvp

# Create Conda environment
conda create -n pvp python=3.7
conda activate pvp

# Install dependencies
pip install -r requirements.txt
pip install -e .

# Install evdev package (Linux only)
pip install evdev


# You now have installed MetaDrive and MiniGrid.
# To set up CARLA dependencies, please click the details below.
Set up CARLA dependencies
# Step 1: Download and unzip CARLA 0.9.10.1 to your home folder
cd ~/
wget https://carla-releases.s3.eu-west-3.amazonaws.com/Linux/CARLA_0.9.10.1.tar.gz
export CARLA_ROOT="CARLA_0.9.10.1"
mkdir ${CARLA_ROOT}
tar -xf CARLA_0.9.10.1.tar.gz -C ${CARLA_ROOT}  # CARLA is stored at: ~/CARLA_0.9.10.1

# Step 2: Setup the environment variables
vim ~/.bashrc
# Add following sentences and replace PATH_TO_CARLA_ROOT with the path to ${CARLA_ROOT} 
export CARLA_ROOT="~/CARLA_0.9.10.1"
export PYTHONPATH="${CARLA_ROOT}/PythonAPI/carla/":"${CARLA_ROOT}/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg":${PYTHONPATH}

# Step 3: Activate your conda environment and test if CARLA is installed correctly.
conda activate pvp  # If you are using conda environment "pvp"
python -c "import carla"  # If no error raises, the installation is successful.

# Step 4: Install dependencies
pip install DI-engine==0.2.2
pip install torchvision
pip install markupsafe==2.0.1

# NOTE: If you are using a new conda environment, you might need to reinstall 'pvp' repo.
# Now let's jump to the CARLA section to run experiment!

Launch Experiments

MetaDrive

Metadrive provides options for three control devices: steering wheel, gamepad and keyboard.

During experiments human subject can always press E to pause the experiment and press Esc to exit the experiment. The main experiment will run for 40K steps and takes about one hour. For toy environment with --toy_env, it takes about 10 minutes.

Click for the experiment details:

MetaDrive - Keyboard
# Go to the repo root
cd ~/pvp

# Run toy experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--device keyboard \
--toy_env \
--exp_name pvp_metadrive_toy_keyboard

# Run full experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--device keyboard \
--exp_name pvp_metadrive_keyboard \
--wandb \
--wandb_project WADNB_PROJECT_NAME \
--wandb_team WANDB_ENTITY_NAME
Action Control
Steering A/D
Throttle W
Human intervention Space or WASD
MetaDrive - Steering Wheel (Logitech G29)

Note: Do not connect Xbox controller with the steering wheel at the same time!

# Go to the repo root
cd ~/pvp

# Run toy experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--device wheel \
--toy_env \
--exp_name pvp_metadrive_toy_wheel

# Run full experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--device wheel \
--exp_name pvp_metadrive_wheel \
--wandb \
--wandb_project WADNB_PROJECT_NAME \
--wandb_team WANDB_ENTITY_NAME
Action Control
Steering Steering wheel
Throttle Throttle pedal
Human intervention Left/Right gear shifter
MetaDrive - Gamepad (Xbox Wireless Controller)

Note: Do not connect Xbox controller with the steering wheel at the same time!

# Go to the repo root
cd ~/pvp

# Run toy experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--device gamepad \
--toy_env \
--exp_name pvp_metadrive_toy_gamepad

# Run full experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--device gamepad \
--exp_name pvp_metadrive_gamepad \
--wandb \
--wandb_project WADNB_PROJECT_NAME \
--wandb_team WANDB_ENTITY_NAME
Action Control
Steering Left-right of Left Stick
Throttle Up-down of Right Stick
Human intervention X/A/B & Left/Right Trigger

CARLA

We use CARLA 0.9.10.1 as the backend and use the environment created by DI-Drive as the gym interface. CARLA uses a server-client architecture. To run experiment, launch the server first:

# Launch an independent terminal, then:
cd ~/CARLA_0.9.10.1  # Go to your CARLA root
./CarlaUE4.sh -carla-rpc-port=9000  -quality-level=Epic  # Can set to Low to accelerate
# Now you should see a pop-up window and you can use WASD to control the camera.

Click for the experiment details:

CARLA - Steering Wheel (Logitech G29)

Note: Do not connect Xbox controller with the steering wheel at the same time!

# Launch the CARLA server if you haven't done yet
~/CARLA_0.9.10.1/CarlaUE4.sh -carla-rpc-port=9000  -quality-level=Epic  # Can set to Low to accelerate

# Go to the repo root
cd ~/pvp

# Run experiment without Wandb:
python pvp/experiments/carla/train_pvp_carla.py --exp_name pvp_carla_test

# Run full experiment
python pvp/experiments/metadrive/train_pvp_metadrive.py \
--exp_name pvp_carla \
--wandb \
--wandb_project WADNB_PROJECT_NAME \
--wandb_team WANDB_ENTITY_NAME
Action Control
Throttle Throttle pedal
Human intervention Left/Right gear shifter
Steering Steering wheel

Minigrid

Click for the experiment details:

MiniGrid - Keyboard

Mapping between environment nick name --env and env_id:

  • emptyroom - MiniGrid-Empty-6x6-v0
  • tworoom - MiniGrid-MultiRoom-N2-S4-v0
  • fourroom - MiniGrid-MultiRoom-N4-S5-v0
# Go to the repo root
cd ~/pvp

# Run experiment without Wandb:
python pvp/experiments/minigrid/train_pvp_minigrid.py --exp_name pvp_minigrid_test

# Run full experiment
# Choose --env from ["emptyroom", "tworoom", "fourroom"]
python pvp/experiments/minigrid/train_pvp_minigrid.py \
--env tworoom \
--exp_name pvp_minigrid \
--wandb \
--wandb_project WADNB_PROJECT_NAME \
--wandb_team WANDB_ENTITY_NAME
Action Control
Turn Left Left
Turn Right Right
Gown Straight Up
Approve Agent Action Space / Down
Open Door / Toggle T
Pickup P
Drop D
Done Complete Task D

📎 References

@inproceedings{peng2023learning,
  title={Learning from Active Human Involvement through Proxy Value Propagation},
  author={Peng, Zhenghao and Mo, Wenjie and Duan, Chenda and Li, Quanyi and Zhou, Bolei},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}