takuseno/ppo

Proximal Policy Optimization implementation with TensorFlow

Python, MIT license

Issues

BreakoutNoFrameskip-v4 does not converge
#7 opened 5 years ago by initial-h
3
Does it matter if the pi and value functions share a single network?
#8 opened 5 years ago by initial-h
3
Pendulum-v0 doesn't converge
#6 opened 5 years ago by rapop
1
Can we use PPO for discrete action spaces?
#2 opened 6 years ago by shamanez
3
Why is the total loss the difference of the losses instead of the sum?
#5 opened 6 years ago by t-mullen
1
Incorrect Surrogate Loss Equation
#3 opened 6 years ago by Supermaxman
2
policy network
#1 opened 7 years ago by huiwenzhang
6