tomasspangelo/proximal-policy-optimization

An implementation of Proximal Policy Optimization (PPO), a state-of-the-art family of reinforcement learning algorithms. It uses normalized Generalized Advantage Estimation (GAE) and supports optional batch-mode training. The loss function incorporates an entropy bonus.
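
For readers unfamiliar with the terms in the description, the following is a minimal sketch of normalized GAE and a clipped PPO loss with an entropy bonus. It is not taken from this repository: the function names (`normalized_gae`, `ppo_loss`), the parameter names, and the default hyperparameters are all assumptions chosen for illustration, and plain Python lists stand in for whatever tensor library the project actually uses.

```python
import math
from typing import List, Sequence


def normalized_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,  # assumed discount factor
    lam: float = 0.95,    # assumed GAE smoothing parameter
    eps: float = 1e-8,
) -> List[float]:
    """Generalized Advantage Estimation followed by normalization.

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; the bootstrap term is dropped at episode ends.
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
    # Normalize to zero mean and unit (population) standard deviation.
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    std = math.sqrt(var)
    return [(a - mean) / (std + eps) for a in advantages]


def ppo_loss(
    new_log_probs: Sequence[float],
    old_log_probs: Sequence[float],
    advantages: Sequence[float],
    entropies: Sequence[float],
    clip_eps: float = 0.2,       # assumed clipping range
    entropy_coef: float = 0.01,  # assumed entropy bonus weight
) -> float:
    """Clipped surrogate policy loss minus a weighted entropy bonus."""
    n = len(advantages)
    surrogate = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two objectives.
        surrogate += min(ratio * adv, clipped * adv)
    policy_loss = -surrogate / n
    mean_entropy = sum(entropies) / n
    # Subtracting the entropy term rewards more exploratory policies.
    return policy_loss - entropy_coef * mean_entropy
```

In a typical training loop, the advantages would be computed once per collected rollout and the loss evaluated on the whole rollout or on mini-batches of it; which of the two applies here depends on the batch-mode option mentioned above.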

Python
