shamanez/GAIL-with-WGAN-loss-for-the-Discriminator

This is about imitation learning using PPO and WGAN-GP loss. This is heavily influenced by GAIL-PPO repository in following link - https://github.com/uidilr/gail_ppo_tf. My agent will get converged to perform his task around 3384 iterations.

Python