Issues
BreakoutNoFrameskip-v4 does not converge
#7 opened by initial-h - 3
Pendulum-v0 doesn't converge
#6 opened by rapop - 3
Can we use PPO for discrete action spaces?
#2 opened by shamanez - 1
Incorrect Surrogate Loss Equation
#3 opened by Supermaxman - 6
policy network
#1 opened by huiwenzhang