n_iter is never updated in the Rainbow agent model
RaviKumarAndroid opened this issue · 8 comments
That's also the reason why the target_model weights are never updated. It's a bug.
You should update the value of the n_iter variable in the agent's act function.
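For reference, a minimal sketch of what the fix could look like. Only `n_iter`, `act`, `model` and `target_model` come from this thread; the class name, `reset_every`, and the greedy action selection are illustrative placeholders, and a Keras-style `get_weights()`/`set_weights()` API is assumed.

```python
import numpy as np

class RainbowAgentSketch:
    """Illustrative only: the counter/target-sync logic the issue says is missing."""

    def __init__(self, model, target_model, reset_every=10_000):
        self.model = model                # online network (Keras-style API assumed)
        self.target_model = target_model  # target network
        self.reset_every = reset_every    # hypothetical hard-update interval
        self.n_iter = 0

    def act(self, state):
        self.n_iter += 1  # the counter the issue reports is never incremented

        # hard update: periodically copy the online weights into the target network
        if self.n_iter % self.reset_every == 0:
            self.target_model.set_weights(self.model.get_weights())

        # placeholder greedy action selection
        q_values = self.model.predict(np.expand_dims(state, axis=0), verbose=0)
        return int(np.argmax(q_values[0]))
```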
Hello @RaviKumarAndroid,
Thank you for pointing that out, although I feel it also has to do with Categorical DQN being missing from the implementation, along with using a Kullback-Leibler divergence loss instead of the Huber loss.
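To illustrate what that loss change would mean, here is a rough sketch assuming a TensorFlow/Keras stack (an assumption about this repo); the C51 projection of the Bellman target onto the fixed support is not shown:

```python
import tensorflow as tf

huber = tf.keras.losses.Huber()      # standard DQN: regression on scalar Q targets
kl = tf.keras.losses.KLDivergence()  # categorical DQN: match return distributions

def standard_dqn_loss(q_target, q_pred):
    return huber(q_target, q_pred)

def categorical_dqn_loss(target_dist, pred_dist):
    # both arguments are (batch, n_atoms) probability distributions over the support
    return kl(target_dist, pred_dist)
```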
OK, but then where do you update the target model weights? If you say n_iter is not needed, the target model's weights are never updated anywhere. If you don't update n_iter, the target model will never learn: it is neither fitted anywhere nor has its weights set from the online model.
There are two ways to update the target model: either soft updates using the tau hyperparameter, or hard updates that copy the weights from the online model to target_model every few thousand iterations.
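For concreteness, a sketch of both schemes, assuming Keras-style models that expose `get_weights()`/`set_weights()`; `tau` and the update interval are hyperparameters that would need tuning:

```python
def soft_update(model, target_model, tau=0.001):
    """Blend a small fraction of the online weights into the target every step."""
    mixed = [tau * w + (1.0 - tau) * tw
             for w, tw in zip(model.get_weights(), target_model.get_weights())]
    target_model.set_weights(mixed)

def hard_update(model, target_model):
    """Copy the online weights into the target wholesale every few thousand steps."""
    target_model.set_weights(model.get_weights())
```

Soft updates nudge the target a little on every step; hard updates keep it frozen between full copies.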
any update?
Hello @lorrp1
I have been working on some other projects and haven't had time to make the improvements mentioned by @RaviKumarAndroid, although I have tried the hard update and it seems to worsen the current benchmark. I anticipate that multiple changes are required to improve the current benchmark and make the agent generalize on real-time data. I don't have time at the moment, but I will start working on this issue soon.
Hello @lorrp1 @RaviKumarAndroid,
Sorry for the delay, but the changes have been made, although the agent's performance has not improved. The agent now does hard updates every 10,000 iterations. I will soon add Categorical DQN and make a few tweaks to the model itself, and will try extensive hyperparameter tuning to get the best out of the agent. Until then, I will close this issue.
Thank you, I’ll check it soon.