rl_games won't load a trained model when the command-line argument `--checkpoint=` is given

Question

rl_games won't load a trained model when the command-line argument `--checkpoint=` is given

Closed this issue 2 years ago · 3 comments

Problem

I am using rl-games with IsaacGym to train my RL agent. However, when I tried to use the --checkpoint= command-line argument to resume training, I found that training always restarts from the very beginning. I use rl-games as follows:

runner = Runner(algo_observer)
runner.load(cfg_train)
runner.reset()
runner.run(args)

and resume training with the command:

$ python ./rlg_train.py --task=[my_task_name] --checkpoint=[absolute path of trained model]

My Solution

I took a look at the source code and found that the method Runner.run_train(self) contains a duplicated load_config() call.

else:
    self.reset()
    self.load_config(self.default_config)  # <-- the duplicated call

This line causes the --checkpoint command-line argument to be overridden by the settings in the config file.
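To illustrate the effect, here is a minimal, self-contained sketch. ToyRunner, its apply_cli() method, the config keys 'load_checkpoint' and 'load_path', and the path /tmp/model.pth are all hypothetical stand-ins, not the actual rl_games code. It only mimics the order of calls described above, assuming that load_config() re-reads the checkpoint settings from the config.

```python
class ToyRunner:
    """Hypothetical stand-in for a runner; NOT the real rl_games class."""

    def __init__(self, default_config):
        self.default_config = default_config
        self.load_check_point = False
        self.load_path = None

    def load_config(self, config):
        # Assumption: loading the config (re)sets the checkpoint fields.
        self.load_check_point = config.get('load_checkpoint', False)
        self.load_path = config.get('load_path')

    def apply_cli(self, args):
        # The command-line value is applied first ...
        if args.get('checkpoint'):
            self.load_check_point = True
            self.load_path = args['checkpoint']

    def run_train(self):
        # ... and is then overwritten by the second load_config() call.
        self.load_config(self.default_config)
        return self.load_check_point, self.load_path


runner = ToyRunner({'load_checkpoint': False, 'load_path': None})
runner.apply_cli({'checkpoint': '/tmp/model.pth'})  # hypothetical path
result = runner.run_train()
print(result)  # the checkpoint given on the command line is lost
```

In this toy version the value set from the command line does not survive run_train(), which matches the behaviour I observed.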

I think this call should be deleted, and the following code should be added to the Runner.run(self, args) function:

if 'checkpoint' in args and args['checkpoint'] is not None:
    if len(args['checkpoint']) > 0:
        self.load_check_point = True  # <-- enable checkpoint loading
        self.load_path = args['checkpoint']

so that I can use the command-line argument to resume training without modifying my config file.
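The precedence I have in mind can be sketched as a small standalone function. This is only an illustration under stated assumptions: resolve_checkpoint is a hypothetical helper, and the config keys 'load_checkpoint' and 'load_path' as well as the example paths are made up for the sketch, not taken from the rl_games source.

```python
def resolve_checkpoint(args, config):
    """Return (load_check_point, load_path), letting the CLI win over the config.

    Hypothetical helper for illustration only; not part of rl_games.
    """
    cli_path = args.get('checkpoint')
    if cli_path:  # skips both None and the empty string
        return True, cli_path
    # Fall back to whatever the config file says (assumed key names).
    return bool(config.get('load_checkpoint', False)), config.get('load_path')


# A checkpoint given on the command line takes priority over the config.
cli_wins = resolve_checkpoint(
    {'checkpoint': '/tmp/model.pth'},                       # hypothetical path
    {'load_checkpoint': False, 'load_path': None},
)

# Without a command-line value, the config settings are used unchanged.
config_used = resolve_checkpoint(
    {'checkpoint': ''},
    {'load_checkpoint': True, 'load_path': '/tmp/old.pth'},  # hypothetical path
)
print(cli_wins, config_used)
```

With this ordering, the config file only matters when no checkpoint is passed on the command line.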

Could you please take a look and check if I've gotten it right? Thanks a lot!

Answer 1 · 2022-01-17T22:44:57.000Z

Hi @chaojie-fu.
What do you think if I just remove this option from the config and keep only the command-line one?
It looks like updating the config is not convenient anyway.

Answer 2 · 2022-01-18T06:03:59.000Z

Removing this option from the config file should be fine; for me, the command-line option is more convenient for testing.

Answer 3 · 2022-01-21T06:13:23.000Z

Fixed in #117. I removed the option from the YAML file parser, so only --checkpoint works now.