1jsingh/rl_pong

Train an RL agent to play Pong using Proximal Policy Optimization (PPO)

Language: Jupyter Notebook. License: MIT.

Output demo

The player on the left is the standard computer-controlled opponent, while the player on the right is the trained RL agent.

Using REINFORCE

Using PPO
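
The two demos above differ in the loss used to update the policy. The repository's own notebook code is not reproduced here, so the following is only a minimal sketch of the textbook REINFORCE and PPO clipped-surrogate losses in plain Python. The function names `reinforce_loss` and `ppo_clip_loss` are hypothetical, and `epsilon=0.2` is an assumed, commonly used clipping value, not one taken from this project.

```python
import math


def reinforce_loss(log_probs, returns):
    """REINFORCE loss: negative mean of log pi(a|s) weighted by the return."""
    return -sum(lp * g for lp, g in zip(log_probs, returns)) / len(log_probs)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """PPO clipped surrogate loss (negated so that lower is better).

    epsilon is an assumed default; tune it for the actual task.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    old = [math.log(0.4)]
    new = [math.log(0.8)]  # the new policy doubles the action probability
    print(ppo_clip_loss(new, old, [1.0]))   # positive advantage: gain is capped by the clip
    print(ppo_clip_loss(new, old, [-1.0]))  # negative advantage: penalty is not capped
```

The clipping limits how much a single batch can reward moving the policy away from the one that collected the data, which is why the same trajectories can be reused for several gradient steps under PPO, whereas REINFORCE uses each batch once.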