I don't think PPO pendulum is converging

Question

Bigpig4396 opened this issue 5 years ago · 4 comments

Answer 1 · 2019-09-28T08:43:39.000Z

Yes, the problem is that the activation function is chosen incorrectly.

Answer 2 · 2020-03-08T17:05:43.000Z

I don't think this repo implements PPO correctly either.

Answer 3 · 2020-03-18T13:51:47.000Z

Change the activation function from ReLU to tanh.

Answer 4 · 2021-05-07T08:58:22.000Z

Right, change ReLU to tanh in the actor network.
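
For anyone landing here later, this is a rough sketch of what that change could look like. It is not the repo's code: the class name `TinyActor`, the layer sizes, and the `2.0 * tanh` output scaling are all assumptions for illustration, written in plain Python so it runs without any framework. It assumes "change relu to tanh" refers to the hidden-layer activation; the thread does not say which layer is meant. Pendulum's torque is bounded to [-2, 2], which is why the mean is scaled by 2.

```python
import math
import random

MAX_TORQUE = 2.0  # Pendulum's action bound is [-2, 2]


def relu(x):
    return max(0.0, x)


class TinyActor:
    """Hypothetical one-hidden-layer actor that outputs the policy mean.

    Sizes and initialisation are illustrative assumptions, not the repo's.
    """

    def __init__(self, obs_dim=3, hidden=16, activation=math.tanh, seed=0):
        rng = random.Random(seed)
        self.activation = activation
        self.w1 = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b2 = 0.0

    def mean(self, obs):
        # Hidden layer: this is where ReLU would be swapped for tanh.
        h = [self.activation(sum(w * o for w, o in zip(row, obs)) + b)
             for row, b in zip(self.w1, self.b1)]
        pre = sum(w * x for w, x in zip(self.w2, h)) + self.b2
        # Squash the mean into the valid torque range.
        return MAX_TORQUE * math.tanh(pre)


# Same weights (same seed), different hidden activation.
tanh_actor = TinyActor(activation=math.tanh)
relu_actor = TinyActor(activation=relu)

obs = [1.0, 0.0, 0.5]  # cos(theta), sin(theta), angular velocity
print("tanh hidden:", tanh_actor.mean(obs))
print("relu hidden:", relu_actor.mean(obs))
```

With tanh, every hidden unit stays in [-1, 1] and is zero-centered, while ReLU units are unbounded above and zero for negative inputs. That may be why the tanh version trains more stably here, but I have not verified it against this repo.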