ICML'20: Intrinsic Reward Driven Imitation Learning via Generative Model (PyTorch implementation)


This is the code for the paper: Intrinsic Reward Driven Imitation Learning via Generative Model
Xingrui Yu, Yueming Lyu and Ivor W. Tsang
Presented at ICML 2020.

If you find this code useful in your research, please cite:

  title={Intrinsic Reward Driven Imitation Learning via Generative Model},
  author={Yu, Xingrui and Lyu, Yueming and Tsang, Ivor W.},
  booktitle={International Conference on Machine Learning},

Our implementation is based on the repo: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail.

Please refer to that repo for installation instructions, or create the experimental environment with:

conda env create -f imitation_mpi_env.yml
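
After creating and activating the environment, a quick import check can confirm that the interpreter sees the expected packages. The snippet below is a minimal sketch, not part of the repository: `missing_packages` and `REQUIRED` are hypothetical names, and listing `torch` is an assumption based only on this being a PyTorch implementation.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: the environment provides PyTorch under the usual import name "torch".
# Extend this list with whatever else your run needs.
REQUIRED = ["torch"]
```

Calling `missing_packages(REQUIRED)` inside the activated environment should return an empty list; any name it returns is not importable there.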

To reproduce our final results, follow these steps:

  1. train expert:

    sh scripts/expert_atari.sh
  2. generate one-life demonstration:

    sh scripts/gen_one_life_demonstration.sh
  3. train reward module:

    sh scripts/train_reward_module.sh
  4. policy optimization with the learned reward:

    sh scripts/policy_optimization.sh
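
The four steps above must run in order, since each stage depends on the output of the previous one. One possible way to chain them is the small driver sketched below. It is not part of the repository: `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical names, and the sketch assumes it is run from the repository root so that the `scripts/` paths resolve.

```python
import subprocess

# Stage names and script paths exactly as listed in the steps above.
PIPELINE = [
    ("train expert", "scripts/expert_atari.sh"),
    ("generate one-life demonstration", "scripts/gen_one_life_demonstration.sh"),
    ("train reward module", "scripts/train_reward_module.sh"),
    ("policy optimization with the learned reward", "scripts/policy_optimization.sh"),
]


def build_commands():
    """Return (stage name, argv) pairs in execution order."""
    return [(name, ["sh", script]) for name, script in PIPELINE]


def run_pipeline(repo_root=".", dry_run=False):
    """Run every stage in order and return the names of the stages handled.

    With dry_run=True the commands are only printed, not executed.
    """
    done = []
    for name, cmd in build_commands():
        print(f"==> {name}: {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed stage stops the pipeline before later stages run.
            subprocess.run(cmd, cwd=repo_root, check=True)
        done.append(name)
    return done
```

Calling `run_pipeline(dry_run=True)` prints the four commands without executing them; `run_pipeline()` executes them and stops at the first stage that fails.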