"PPO_Continuous.py" trained for 1000 episodes with no effect
Synmul opened this issue · 4 comments
Synmul commented
MedhaviMonish commented
Something similar happened to me. I tried the continuous A2C on Pendulum without any changes (except that the total number of episodes was set to 3000), but the reward still varies between -1000 and 0, and it rarely gets close to 0. So I tried the discrete A2C on CartPole, also without any changes, and again it is too slow to train.
natetsang commented
I got the same results: PPO continuous doesn't appear to learn anything. I'm running TF 2.3, so it doesn't have to do with your version, @Synmul.
alifrahmatullah commented
Same here. No changes.