pat-coady/trpo

training issue

wenyijiang opened this issue · 4 comments

hello!
When I load the saved Humanoid-v1 model, it takes several minutes before it reaches its best result. For example, when I load the model saved after 200000 training episodes, the humanoid walks poorly at first and only walks better after several minutes.
Why does this happen?
Thanks!

Hi @wenyijiang

Did you add model saving capability? If so, I would like to add that to my implementation too.

I'm not sure I understand your question. Can you re-state it?

Thanks, Pat

Hi!
Does it make sense to set the scaler to some fixed values and not update them during training?
I want to save the trained model, so I added "saver=tf.train.Saver" to your code, and I get some saved models after training. But when I load one of these saved models, the humanoid walks poorly at first and only walks better after several minutes. Why does this happen?
The red lines in the two pictures mark the code I added. The changes are in value_function.py and policy.py.
Thanks very much!!!

@wenyijiang
Sorry for the delay, I started a new job 2 weeks ago and have been quite busy.

It should be fine to use constant scaling, so that you can save a model and it will work OK after loading. In fact, if you look in archive.py you will see a ConstantScaler class that you should be able to use. The only issue is that some environments have a wide range of observation scales. You may benefit from manually adjusting the scaler to bring the observation values into a similar range.

Let me know if performance is still bad after loading a model when you are using a constant scaler.
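
For anyone trying the constant-scaling suggestion, here is a minimal sketch of what a fixed scaler could look like. It is not the ConstantScaler from archive.py: the class name FixedScaler, its methods, and the JSON file are all hypothetical, and it assumes observations are normalized as (obs - offset) * scale. The idea is that scale and offset never change, and they are written to disk next to the model checkpoint so that a loaded policy sees observations normalized exactly as they were during training.

```python
import json
import os
import tempfile


class FixedScaler:
    """Hypothetical constant scaler (illustration only, not the repo's class)."""

    def __init__(self, scale, offset):
        if len(scale) != len(offset):
            raise ValueError("scale and offset must have the same length")
        self.scale = [float(s) for s in scale]
        self.offset = [float(o) for o in offset]

    def update(self, observations):
        """Deliberately a no-op: the statistics never change."""

    def get(self):
        """Return the fixed (scale, offset) pair."""
        return self.scale, self.offset

    def apply(self, obs):
        # Assumed convention: (obs - offset) * scale, element by element.
        return [(x - o) * s for x, o, s in zip(obs, self.offset, self.scale)]

    def save(self, path):
        # Store the values in a small JSON file alongside the checkpoint.
        with open(path, "w") as f:
            json.dump({"scale": self.scale, "offset": self.offset}, f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            data = json.load(f)
        return cls(data["scale"], data["offset"])


if __name__ == "__main__":
    # Made-up values for a 2-dimensional observation.
    scaler = FixedScaler(scale=[0.5, 2.0], offset=[1.0, -1.0])
    path = os.path.join(tempfile.mkdtemp(), "scaler.json")
    scaler.save(path)
    restored = FixedScaler.load(path)
    print(restored.apply([3.0, 0.0]))
```

The same save/load step would also work for a running scaler's statistics. If those statistics are not restored together with the checkpoint, a freshly created scaler has to re-learn them, which may be one reason a loaded model needs some time before it performs well again. That explanation is an assumption and was not confirmed in this thread.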

Closed due to lack of activity.