Denys88/rl_games

Error when loading agent weights

Closed this issue · 5 comments

BY571 commented

Hey,

first of all, thank you for the great work! I came across your repo through IsaacGymEnvs and was training some Trifinger agents.
However, when trying to load the trained weights, I get the following error:

RuntimeError: Error(s) in loading state_dict for Network: Unexpected key(s) in state_dict: "value_mean_std.running_mean", "value_mean_std.running_var", "value_mean_std.count"

I'm running from basically the same repository and did not change any parameters in the config.
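The error above means the checkpoint contains keys that the network being constructed does not define. A minimal sketch for diagnosing this kind of mismatch is to compare the two key sets directly. The helper name `diff_state_dict_keys` and the `actor.*` keys are hypothetical and only for illustration; the `value_mean_std.*` keys are taken from the error message.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) keys, mirroring the wording of the load error."""
    model, ckpt = set(model_keys), set(checkpoint_keys)
    missing = sorted(model - ckpt)      # expected by the model, absent from the checkpoint
    unexpected = sorted(ckpt - model)   # present in the checkpoint, unknown to the model
    return missing, unexpected


# Illustrative keys: "actor.*" is made up, "value_mean_std.*" comes from the error above.
model_keys = ["actor.weight", "actor.bias"]
checkpoint_keys = model_keys + [
    "value_mean_std.running_mean",
    "value_mean_std.running_var",
    "value_mean_std.count",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real PyTorch module, `model.state_dict().keys()` would supply the first argument, and the second would be the keys of the saved state dict, wherever the checkpoint file stores it. Unexpected keys like these usually indicate that the checkpoint was written by a different version of the code than the one loading it.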

Hi @BY571,

Thank you for using rl-games and IsaacGymEnvs. What version of rl-games did you use? Also + @ArthurAllshire

Can you try installing rl-games 1.1.3?

BY571 commented

Thanks for the help! Indeed, a different rl-games version had been installed on my remote machine. Fixing the version solved the problem.
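A quick way to catch this kind of mismatch between machines is to print the installed version before loading any weights. The sketch below uses only the standard library. The distribution name `rl-games` is an assumption about how the package is registered, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


EXPECTED = "1.1.3"  # the version suggested earlier in this thread
found = installed_version("rl-games")  # assumed distribution name

if found is None:
    print("rl-games is not installed in this environment")
elif found != EXPECTED:
    print(f"rl-games {found} is installed, expected {EXPECTED}")
else:
    print(f"rl-games {found} matches the expected version")
```

Running this on both the local and the remote machine shows right away whether the two environments differ.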

BY571 commented

The versioning seems to cause some trouble when testing. When I run with 1.3.1, the Trifinger robot does not move at all. However, when I load the same weights with version 1.1.3, the robot works and moves around, lifting the cube. I haven't yet found out what causes this.

@BY571 I've found the issue.
IsaacGym never passed the checkpoint on the command line, and I removed this option from the config because it added some mess.
The quickest solution is to fix train.py in IsaacGym:
runner.run({
    'train': not cfg.test,
    'play': cfg.test,
    'checkpoint': cfg.checkpoint
})
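To see what the call above passes, here is a minimal, self-contained sketch of how the argument dictionary is built from the config. The keys `train`, `play` and `checkpoint` and the `cfg.test` / `cfg.checkpoint` attributes come from the snippet above. The `build_run_args` helper, the `SimpleNamespace` stand-in for the config, and the placeholder path are assumptions made for illustration.

```python
from types import SimpleNamespace
from typing import Any, Dict


def build_run_args(cfg: Any) -> Dict[str, Any]:
    """Build the dictionary handed to runner.run, including the checkpoint path."""
    return {
        "train": not cfg.test,         # train unless test mode is requested
        "play": cfg.test,              # play back a trained policy in test mode
        "checkpoint": cfg.checkpoint,  # without this entry the weights are never loaded
    }


# Stand-in for the real config object; the path is a placeholder, not a real file.
cfg = SimpleNamespace(test=True, checkpoint="path/to/checkpoint.pth")
args = build_run_args(cfg)
print(args)
```

If the `checkpoint` entry is missing, the player would presumably start from freshly initialised weights, which would be consistent with the robot not moving at all.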

Also, you cannot load weights from 1.1.3 into 1.3.1 right now. If you need this and cannot retrain the robot, I can make a simple script for you.