This is the official PyTorch implementation of our NeurIPS 2022 paper "When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning" by Annie Xie*, Fahim Tajwar*, Archit Sharma*, and Chelsea Finn. Please see the project website for example results. For any questions or concerns related to the codebase, please reach out to Fahim Tajwar.
If you use this repo in your research, please consider citing our paper:
@inproceedings{xie2022paint,
author = {Xie, Annie and Tajwar, Fahim and Sharma, Archit and Finn, Chelsea},
booktitle = {Advances in Neural Information Processing Systems},
title = {When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning},
volume = {35},
year = {2022}
}
Install MuJoCo if it is not already installed:
- Obtain a license on the MuJoCo website.
- Download MuJoCo binaries here.
- Unzip the downloaded archive into ~/.mujoco/mujoco200 and place your license key file mjkey.txt at ~/.mujoco.
- Use the env variables MUJOCO_PY_MJKEY_PATH and MUJOCO_PY_MUJOCO_PATH to specify the MuJoCo license key path and the MuJoCo directory path.
- Append the MuJoCo bin subdirectory path (~/.mujoco/mujoco200/bin) to the env variable LD_LIBRARY_PATH.
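To verify the files landed where mujoco-py expects them before moving on, here is a small check in Python; the paths follow the steps above, so adjust them if you unpacked MuJoCo elsewhere:

```python
# Quick check that the MuJoCo binaries and license key are where mujoco-py looks for them.
# Paths follow the install steps above; adjust if you unpacked MuJoCo elsewhere.
import os

mujoco_bin = os.path.expanduser("~/.mujoco/mujoco200/bin")
license_key = os.path.expanduser("~/.mujoco/mjkey.txt")

print("MuJoCo bin directory exists:", os.path.isdir(mujoco_bin))
print("License key exists:", os.path.isfile(license_key))
```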
Install the following libraries:
sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3
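To confirm the rendering libraries can be located afterwards, a minimal check via ctypes; the library names below are the usual Ubuntu ones and may differ on other distributions:

```python
# Check that the rendering libraries installed via apt can be located at runtime.
# Library names follow standard Ubuntu packaging; other distros may differ.
from ctypes.util import find_library

for name in ("OSMesa", "GL", "glfw"):
    print(name, "->", find_library(name))  # None means the library was not found
```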
Add the MuJoCo and code paths to your ~/.bashrc by appending the following lines:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco200/bin
export PYTHONPATH=$PYTHONPATH:~/proactive_interventions/
export PYTHONPATH=$PYTHONPATH:~/proactive_interventions/envs/
export PYTHONPATH=$PYTHONPATH:~/proactive_interventions/envs/sticky_wall_env
Make sure the paths you enter match the actual paths on your machine rather than the template paths above. Then source the bashrc file:
source ~/.bashrc
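To confirm the new variables are picked up, open a fresh shell and run a quick check from Python; the expected values come from the bashrc lines above:

```python
# Quick check that the variables set in ~/.bashrc are visible to Python.
# Run this from a fresh (or re-sourced) shell.
import os
import sys

print(os.environ.get("LD_LIBRARY_PATH", ""))  # should include ~/.mujoco/mujoco200/bin
print([p for p in sys.path if "proactive_interventions" in p])  # repo paths from PYTHONPATH
```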
Install dependencies:
conda env create -f paint/conda_env.yml
conda activate paint
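Once the environment is active, a quick sanity check that the core dependencies resolve; this assumes conda_env.yml provides PyTorch and mujoco-py (as in DrQ-v2-based setups), and the tiny XML model is a throwaway example used only for this check:

```python
# Sanity check inside the `paint` conda env.
# Assumes the env provides PyTorch and mujoco-py; the XML below is a throwaway model.
import torch
import mujoco_py

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

MODEL_XML = """
<mujoco>
  <worldbody>
    <body name="box" pos="0 0 1">
      <freejoint/>
      <geom type="box" size="0.1 0.1 0.1"/>
    </body>
  </worldbody>
</mujoco>
"""
sim = mujoco_py.MjSim(mujoco_py.load_model_from_xml(MODEL_XML))
sim.step()
print("mujoco-py OK, qpos:", sim.data.qpos)
```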
First, make sure you are in the "paint" directory.
Train an episodic PAINT agent (maze):
bash run_scripts/maze.sh
Train a PAINT agent in a continuing task setting (cheetah):
bash run_scripts/cheetah.sh
Train a non-episodic (forward-backward) PAINT agent:
(Tabletop manipulation)
bash run_scripts/tabletop.sh
(Peg insertion)
bash run_scripts/peg.sh
Monitor results:
tensorboard --logdir exp_local
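If you prefer to read logged scalars programmatically rather than through the TensorBoard UI, here is a minimal sketch using TensorBoard's EventAccumulator; the exact run-directory layout under exp_local is an assumption, so point it at whichever directory contains the event files and inspect the available tags first:

```python
# Minimal sketch for reading logged scalars without the TensorBoard UI.
# The run directory below is a placeholder; point it at a directory containing event files.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("exp_local/<your_run_dir>")
ea.Reload()

scalar_tags = ea.Tags()["scalars"]
print("Logged scalar tags:", scalar_tags)

if scalar_tags:
    tag = scalar_tags[0]  # pick whichever tag you care about
    for event in ea.Scalars(tag):
        print(event.step, event.value)
```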
The codebase for the algorithm is built on top of the PyTorch implementation of DrQ-v2 (original codebase linked here). The codebase for our environments with irreversibility is built on top of the EARL Benchmark codebase (original codebase linked here). We thank the authors for providing us with easy-to-work-with codebases.