In PPO.ipynb, the positions of action loss epoch and value loss epoch need to be swapped.

Question

wadx2019 opened this issue 3 years ago · 0 comments

wadx2019 commented 3 years ago

In PPO.ipynb, the positions of the action loss epoch and the value loss epoch need to be swapped. I also suggest using RMSprop as the optimizer and reducing the learning rate, which should make these RL models easier to converge.
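
To make the optimizer suggestion concrete, here is a minimal sketch of the standard (non-centered, no momentum) RMSprop update in plain Python. It is not taken from the notebook: the names `rmsprop_step`, `grad`, and `minimize`, the toy quadratic objective, and the hyperparameter values are all assumptions chosen for illustration.

```python
import math


def rmsprop_step(params, grads, sq_avg, lr=1e-4, alpha=0.99, eps=1e-8):
    """Apply one RMSprop update and return (new_params, new_sq_avg).

    sq_avg holds the running average of squared gradients, one per parameter.
    """
    new_params, new_sq_avg = [], []
    for p, g, v in zip(params, grads, sq_avg):
        # Exponential moving average of the squared gradient.
        v = alpha * v + (1.0 - alpha) * g * g
        # Scale the step by the root of that average.
        p = p - lr * g / (math.sqrt(v) + eps)
        new_params.append(p)
        new_sq_avg.append(v)
    return new_params, new_sq_avg


def grad(params):
    """Gradient of the hypothetical toy objective f(x, y) = x^2 + 10 * y^2."""
    x, y = params
    return [2.0 * x, 20.0 * y]


def minimize(lr, steps=2000):
    """Run RMSprop on the toy objective from the start point (1.0, 1.0)."""
    params = [1.0, 1.0]
    sq_avg = [0.0, 0.0]
    for _ in range(steps):
        params, sq_avg = rmsprop_step(params, grad(params), sq_avg, lr=lr)
    return params


if __name__ == "__main__":
    # Compare a larger and a smaller (assumed) learning rate.
    for lr in (1e-2, 1e-3):
        print(lr, minimize(lr))
```

Because each step is divided by the running root-mean-square of the gradient, the step size is roughly bounded by `lr` regardless of the gradient's scale, so lowering `lr` directly limits how far the policy and value parameters move per update.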