A3C Doom: Basic scenario: How to select clipping?
Closed this issue · 5 comments
Why 40.0?
grads, self.grad_norms = tf.clip_by_global_norm(self.gradients, 40.0)
Hi Ibrahim,
This is something that should be adjusted based on the performance findings of your own task. 40 is what was used in the OpenAI starter agent, so I used that here, as it led to convergence for the example task.
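For anyone unsure what the call above actually does: `tf.clip_by_global_norm` rescales *all* gradients jointly so that their combined L2 norm is at most the given value, rather than clipping each tensor independently. A minimal NumPy sketch of that behavior (the function name and test values here are illustrative, not from the repo):

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients jointly so their global L2 norm is at most clip_norm.

    Gradients whose global norm is already below clip_norm pass through unchanged,
    so directions are preserved and only the overall magnitude is capped.
    """
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = clip_norm / max(global_norm, clip_norm)  # <= 1.0; no-op if norm is small
    return [g * scale for g in grads], global_norm

# Example: two gradient tensors with global norm sqrt(30^2 + 40^2) = 50
grads = [np.array([30.0, 40.0]), np.array([0.0])]
clipped, norm = clip_by_global_norm(grads, 40.0)
# norm is 50.0; every gradient is scaled by 40/50 = 0.8, so clipped[0] is [24.0, 32.0]
```

With clip_norm = 40.0, any update whose global norm exceeds 40 is scaled down to exactly 40; smaller updates are untouched.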
Would it be wise to use the Grad Norm plot from this project's TensorBoard to select a value close to the mean-max of the smoothed chart, to prevent overly large updates? Along those lines, is it worth considering an adaptive gradient-norm clip derived from a moving average of the norm?
Thank you @DMTSource for the elaboration.
Could you please give numerical examples, for better understanding?
If the Grad Norm is around 25, should we set clipping = 25?
If the Grad Norm decreases over time (say to 20), should we then set clipping = 20?
Yes, that is what I am thinking. Looking at your png here, my question above asks whether setting the clip value to roughly ~15-20 is appropriate. Or, in a dynamic sense, to recreate the smoothed values seen in TensorBoard (careful: they are smoothed values of smoothed values) and use n_std * std_of_norms_rolling + mean_of_norms_rolling to determine an upper bound that does not throw away information.
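The adaptive idea above could be sketched roughly like this. This is only an illustration of the mean + n_std * std bound over a rolling window; the class name, window size, and fallback value are my own choices, not anything from the repo:

```python
import numpy as np
from collections import deque

class AdaptiveClip:
    """Track a rolling window of observed global gradient norms and derive the
    clip bound as rolling_mean + n_std * rolling_std, so only unusually large
    updates get scaled down. window, n_std, and initial_clip are illustrative."""

    def __init__(self, window=100, n_std=2.0, initial_clip=40.0):
        self.norms = deque(maxlen=window)
        self.n_std = n_std
        self.initial_clip = initial_clip

    def update(self, observed_norm):
        """Record the global norm measured on the latest gradient update."""
        self.norms.append(float(observed_norm))

    def clip_value(self):
        """Current clip bound; falls back to initial_clip until enough samples."""
        if len(self.norms) < 10:
            return self.initial_clip
        arr = np.asarray(self.norms)
        return float(arr.mean() + self.n_std * arr.std())
```

In a training loop you would call `update()` with the norm returned by `tf.clip_by_global_norm` each step, then pass `clip_value()` as the bound for the next step. Whether this interacts badly with hyperparameter changes (the concern raised below) is exactly the open question.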
But back to my question above for @awjuliani: is visual inspection of the Grad Norm plot even a valid way to determine the grad-norm clip value, or will the gradient updates change magnitude with each hyperparameter update?
I hope this will help solve the gathering-health scenario (so far I have failed to make it converge).
waiting for @awjuliani ...