/ppo-self-play

Reinforcement Learning | Multi-Agent RL | Self-Play | Proximal Policy Optimization (PPO) agent | Unity Tennis environment

Primary Language: Python | License: MIT
