susanth-24/Proximal-Policy-Optimization
Proximal Policy Optimization, or PPO, is a policy gradient method for reinforcement learning. The motivation was to have an algorithm with the data efficiency and reliable performance of TRPO
Jupyter Notebook
Stargazers
No one’s star this repository yet.