This is the official PyTorch implementation of our NeurIPS 2022 paper "When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning" by Annie Xie*, Fahim Tajwar*, Archit Sharma*, and Chelsea Finn. Please see the project website for example results. For any questions or concerns related to the codebase, please reach out to Fahim Tajwar.
If you use this repo in your research, please consider citing our paper:
@inproceedings{xie2022paint,
author = {Xie, Annie and Tajwar, Fahim and Sharma, Archit and Finn, Chelsea},
booktitle = {Advances in Neural Information Processing Systems},
title = {When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning},
volume = {35},
year = {2022}
}
Install MuJoCo if it is not already installed:
- Obtain a license on the MuJoCo website.
- Download MuJoCo binaries here.
- Unzip the downloaded archive into ~/.mujoco/mujoco200 and place your license key file mjkey.txt at ~/.mujoco.
- Use the env variables MUJOCO_PY_MJKEY_PATH and MUJOCO_PY_MUJOCO_PATH to specify the MuJoCo license key path and the MuJoCo directory path.
- Append the MuJoCo bin subdirectory path (~/.mujoco/mujoco200/bin) to the env variable LD_LIBRARY_PATH.
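To verify the files landed where mujoco-py expects them before moving on, here is a small check in Python; the paths follow the steps above, so adjust them if you unpacked MuJoCo elsewhere:

```python
# Quick check that the MuJoCo binaries and license key are where mujoco-py looks for them.
# Paths follow the install steps above; adjust if you unpacked MuJoCo elsewhere.
import os

mujoco_bin = os.path.expanduser("~/.mujoco/mujoco200/bin")
license_key = os.path.expanduser("~/.mujoco/mjkey.txt")

print("MuJoCo bin directory exists:", os.path.isdir(mujoco_bin))
print("License key exists:", os.path.isfile(license_key))
```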
Install the following libraries:
sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3
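To confirm the rendering libraries can be located afterwards, a minimal check via ctypes; the library names below are the usual Ubuntu ones and may differ on other distributions:

```python
# Check that the rendering libraries installed via apt can be located at runtime.
# Library names follow standard Ubuntu packaging; other distros may differ.
from ctypes.util import find_library

for name in ("OSMesa", "GL", "glfw"):
    print(name, "->", find_library(name))  # None means the library was not found
```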
Add the MuJoCo and code paths to your ~/.bashrc by appending the following lines:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco200/bin
export PYTHONPATH=$PYTHONPATH:~/proactive_interventions/
export PYTHONPATH=$PYTHONPATH:~/proactive_interventions/envs/
export PYTHONPATH=$PYTHONPATH:~/proactive_interventions/envs/sticky_wall_env
Make sure the paths you enter match the actual paths on your machine rather than the template paths above. Then source the bashrc file:
source ~/.bashrc
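To confirm the new variables are picked up, open a fresh shell and run a quick check from Python; the expected values come from the bashrc lines above:

```python
# Quick check that the variables set in ~/.bashrc are visible to Python.
# Run this from a fresh (or re-sourced) shell.
import os
import sys

print(os.environ.get("LD_LIBRARY_PATH", ""))  # should include ~/.mujoco/mujoco200/bin
print([p for p in sys.path if "proactive_interventions" in p])  # repo paths from PYTHONPATH
```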
Install dependencies:
conda env create -f paint/conda_env.yml
conda activate paint
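Once the environment is active, a quick sanity check that the core dependencies resolve; this assumes conda_env.yml provides PyTorch and mujoco-py (as in DrQ-v2-based setups), and the tiny XML model is a throwaway example used only for this check:

```python
# Sanity check inside the `paint` conda env.
# Assumes the env provides PyTorch and mujoco-py; the XML below is a throwaway model.
import torch
import mujoco_py

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

MODEL_XML = """
<mujoco>
  <worldbody>
    <body name="box" pos="0 0 1">
      <freejoint/>
      <geom type="box" size="0.1 0.1 0.1"/>
    </body>
  </worldbody>
</mujoco>
"""
sim = mujoco_py.MjSim(mujoco_py.load_model_from_xml(MODEL_XML))
sim.step()
print("mujoco-py OK, qpos:", sim.data.qpos)
```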
First, make sure you are in the "paint" directory.
Train an episodic PAINT agent (maze):
bash run_scripts/maze.sh
Train a PAINT agent in a continuing task setting (cheetah):
bash run_scripts/cheetah.sh
Train a non-episodic (forward-backward) PAINT agent:
(Tabletop manipulation)
bash run_scripts/tabletop.sh
(Peg insertion)
bash run_scripts/peg.sh
Monitor results:
tensorboard --logdir exp_local
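If you prefer to read logged scalars programmatically rather than through the TensorBoard UI, here is a minimal sketch using TensorBoard's EventAccumulator; the exact run-directory layout under exp_local is an assumption, so point it at whichever directory contains the event files and inspect the available tags first:

```python
# Minimal sketch for reading logged scalars without the TensorBoard UI.
# The run directory below is a placeholder; point it at a directory containing event files.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator("exp_local/<your_run_dir>")
ea.Reload()

scalar_tags = ea.Tags()["scalars"]
print("Logged scalar tags:", scalar_tags)

if scalar_tags:
    tag = scalar_tags[0]  # pick whichever tag you care about
    for event in ea.Scalars(tag):
        print(event.step, event.value)
```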
The codebase for the algorithm is built on top of the PyTorch implementation of DrQ-v2 (original codebase linked here). The codebase for our environments with irreversibility is built on top of the EARL Benchmark codebase (original codebase linked here). We thank the authors for providing us with easy-to-work-with codebases.