qfettes/DeepRL-Tutorials

In PPO.ipynb, the positions of the action-loss and value-loss epoch values need to be swapped.

wadx2019 opened this issue · 0 comments

In PPO.ipynb, the positions of the action-loss and value-loss epoch values need to be swapped. I would also suggest using RMSprop as the optimizer and reducing the learning rate, which should make these RL models easier to converge.
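To make the swap concrete: the clipped surrogate objective is the action (policy) loss, and the squared error against the returns is the value (critic) loss; logging them in a fixed order prevents the mix-up. The sketch below is a minimal, hypothetical version of these two PPO losses (not the notebook's exact code) written with numpy; all function and variable names are assumptions for illustration.

```python
import numpy as np

def ppo_losses(new_logp, old_logp, advantages, values, returns, clip_eps=0.2):
    """Return PPO's (action_loss, value_loss) in a fixed order.

    Hypothetical sketch of the two losses; returning a named, fixed-order
    pair makes it hard to log the two quantities under swapped labels.
    """
    ratio = np.exp(new_logp - old_logp)          # pi_new(a|s) / pi_old(a|s)
    surr1 = ratio * advantages                   # unclipped surrogate
    surr2 = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    action_loss = -np.minimum(surr1, surr2).mean()       # clipped policy loss
    value_loss = 0.5 * ((values - returns) ** 2).mean()  # MSE critic loss
    return action_loss, value_loss
```

On the optimizer suggestion: in PyTorch this would amount to constructing `torch.optim.RMSprop(model.parameters(), lr=...)` with a smaller learning rate in place of the notebook's current optimizer; the best rate depends on the environment and would need tuning.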