/dip

A PyTorch implementation of Proximal Policy Optimisation (PPO) for the inverted double pendulum problem.
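
For orientation, the core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The sketch below is a minimal, framework-free illustration of that objective in plain Python. It is not taken from this repository: the function name `ppo_clipped_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration, and a real implementation would operate on PyTorch tensors instead of lists.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximised).

    Hypothetical helper for illustration only; clip_eps=0.2 is an assumed
    default, not a value confirmed by this repository.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

In training, the negative of this quantity would typically serve as the policy loss, with the advantages estimated from rollouts of the pendulum environment.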

Primary Language: Jupyter Notebook
