gail-pytorch

PyTorch implementation of GAIL and PPO reinforcement learning algorithms

Generative Adversarial Imitation Learning

PyTorch implementation of the paper:

Ho, Jonathan, and Stefano Ermon. "Generative adversarial imitation learning." Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016.

We also present a report with theoretical and empirical studies based on our understanding of the paper and other related works.

Installation

pip install -r requirements.txt
pip install -e .

[optional] conda install swig
[optional] pip install box2d-py

Note: swig and box2d-py are required only for the LunarLander-v2 environment.
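
To check ahead of time whether the optional dependency is available, a small sketch like the following may help. It assumes that box2d-py installs under the import name `Box2D`; the helpers `has_module` and `has_box2d` are hypothetical names and are not part of this repository.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def has_box2d() -> bool:
    # Assumption: box2d-py installs under the import name "Box2D".
    return has_module("Box2D")


if __name__ == "__main__":
    if has_box2d():
        print("Box2D found: LunarLander-v2 should be usable.")
    else:
        print("Box2D not found: install swig and box2d-py for LunarLander-v2.")
```

CartPole-v0 does not need this dependency, so the check matters only when switching to LunarLander-v2.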

Run Setup

Review the parameters in the corresponding run config file before executing each of the commands below. Example pretrained models and sampled expert trajectories are also provided to work with directly.
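
One way to inspect a run config before launching is sketched below. This is not the repository's own argument-parsing code: `load_config` and `show_config` are hypothetical helpers, and the sketch assumes only that the file passed via `--config` is a JSON object. It makes no assumption about which keys the file contains.

```python
import argparse
import json


def load_config(argv=None) -> dict:
    """Parse --config and return the JSON file's contents as a dict."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True, help="path to a JSON run config")
    args = parser.parse_args(argv)
    with open(args.config, "r", encoding="utf-8") as f:
        return json.load(f)


def show_config(config: dict) -> None:
    # Print every top-level setting so it can be reviewed before a run.
    for key in sorted(config):
        print(f"{key}: {config[key]!r}")
```

For example, `show_config(load_config(["--config", "config/CartPole-v0/config_ppo.json"]))` would list the settings in the PPO config used below.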

Train PPO to learn expert policy

python ppo.py --config config/CartPole-v0/config_ppo.json

Sample expert trajectories

python traj.py --config config/CartPole-v0/config_traj.json

Train GAIL for imitation learning

python main.py --config config/CartPole-v0/config_gail.json
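
For orientation, the adversarial objective from the paper can be sketched in plain Python. The sketch follows the paper's convention, in which the discriminator D(s, a) is pushed toward 1 on policy samples and toward 0 on expert samples, so the policy is trained against the cost log D(s, a). The function names are hypothetical, the entropy term is omitted, and the code in this repository may use the opposite labeling or a different reward form.

```python
import math


def discriminator_loss(d_policy, d_expert, eps=1e-8):
    """Binary cross-entropy form of the GAIL discriminator objective.

    d_policy: D(s, a) probabilities on state-action pairs from the policy.
    d_expert: D(s, a) probabilities on state-action pairs from the expert.
    Minimizing this loss maximizes E_pi[log D] + E_expert[log(1 - D)].
    """
    policy_term = sum(math.log(d + eps) for d in d_policy) / len(d_policy)
    expert_term = sum(math.log(1.0 - d + eps) for d in d_expert) / len(d_expert)
    return -(policy_term + expert_term)


def surrogate_rewards(d_policy, eps=1e-8):
    """Per-step reward -log D(s, a) handed to the policy optimizer (e.g. PPO)."""
    return [-math.log(d + eps) for d in d_policy]
```

Under this convention the reward is larger for state-action pairs that the discriminator scores close to 0, that is, pairs it cannot tell apart from expert data.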

Generate training graphs

python visualize.py --env_id CartPole-v0 --out_dir ../pretrained --model_name ppo
python visualize.py --env_id CartPole-v0 --out_dir ../pretrained --model_name gail

CartPole-v0 Experiment

References

  1. GitHub: nav74neet/gail_gym
  2. GitHub: nikhilbarhate99/PPO-PyTorch
  3. Medium: Article on GAIL
  4. Blog post on PPO algorithm
  5. White Paper on MCE IRL
  6. Blog post on PPO
  7. Blog post on TRPO

Acknowledgements

This work was completed as a course project for CS498: Reinforcement Learning, taught by Professor Nan Jiang. I thank our instructor and the course teaching assistants for their guidance and support throughout the course.

Contact

Jatin Arora

University Mail: jatin2@illinois.edu

External Mail: jatinarora2702@gmail.com

LinkedIn: linkedin.com/in/jatinarora2702