sweetice/Deep-reinforcement-learning-with-pytorch

I don't think PPO pendulum is converging

Bigpig4396 opened this issue · 4 comments

I don't think PPO pendulum is converging.

Yes, the problem is that the activation function is chosen incorrectly.

I don't think this repo implements PPO correctly either.

Change the activation function from ReLU to tanh.

Right, change ReLU to tanh in the actor network.
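
For anyone landing here, a minimal sketch of what that change could look like. This is not the repo's actual code: the class name `Actor`, the layer names, the hidden size of 100, and the softplus standard-deviation head are all assumptions for illustration. The only thing taken from this thread is using tanh rather than ReLU in the actor. The torque bound of 2.0 matches the Pendulum action range.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Actor(nn.Module):
    """Hypothetical Gaussian policy for Pendulum (3-dim observation, 1-dim torque)."""

    def __init__(self, state_dim=3, hidden_dim=100, max_action=2.0):
        super().__init__()
        self.fc = nn.Linear(state_dim, hidden_dim)
        self.mu_head = nn.Linear(hidden_dim, 1)
        self.sigma_head = nn.Linear(hidden_dim, 1)
        self.max_action = max_action  # Pendulum torque lies in [-2, 2]

    def forward(self, state):
        # tanh in place of relu, as suggested in the comments above
        x = torch.tanh(self.fc(state))
        # Bounded mean: tanh output in [-1, 1] scaled to the torque range
        mu = self.max_action * torch.tanh(self.mu_head(x))
        # softplus keeps the standard deviation positive (assumed design choice)
        sigma = F.softplus(self.sigma_head(x)) + 1e-5
        return mu, sigma


if __name__ == "__main__":
    actor = Actor()
    mu, sigma = actor(torch.randn(4, 3))  # batch of 4 random observations
    print(mu.shape, sigma.shape)
```

With tanh the mean can take either sign and stays inside the valid torque range, which is likely why the swap helps here; whether it fully fixes convergence would still need to be checked against the rest of the training loop.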