An efficient implementation of the Proximal Policy Optimization (PPO) algorithm for reinforcement learning, with linear and attention-based policies.
Primary language: Python
License: Apache License 2.0 (Apache-2.0)
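For readers unfamiliar with PPO, the core of the algorithm is its clipped surrogate objective. The following is a minimal, standard-library-only sketch of that objective and is not taken from this repository: the function name `ppo_clip_loss`, its parameters, and the default `clip_eps=0.2` are all assumptions chosen for illustration.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the negated PPO clipped surrogate objective, averaged over samples.

    Hypothetical helper for illustration only; it is not this repository's API.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)

    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(advantages)
```

In a real training loop the log-probabilities would come from the policy network (linear or attention-based) and the loss would be computed on tensors so gradients can flow; the plain-float version here only shows the arithmetic of the clipping step.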