After a few steps the loss explodes or vanishes, and in later epochs it produces 'NaN' as the loss output.
cjt222 opened this issue · 17 comments
I ran into this problem when applying it to CRNN for OCR; adjusting the learning rate didn't help either.
Problem: After a few steps the loss explodes, and in later epochs it produces 'NaN' as the loss output.
It is suffering from the exploding gradients problem!
Solution: Gradient Clipping!
Try this and tell me if the problem recurs.
Gradient Clipping:
Apply: tf.clip_by_value(clipping_variable,1e-10,1.0)
logits=tf.clip_by_value(logits,1e-10,1.0)
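Note that tf.clip_by_value clips element-wise values, not gradients. A minimal NumPy sketch of why this prevents NaN (the probability values below are made up for illustration):

```python
import numpy as np

# Hypothetical per-class probabilities, one of which has underflowed to 0.
probs = np.array([0.7, 0.0, 0.3])

# Without clipping, log(0) yields -inf, which propagates as inf/NaN
# through the loss and its gradients.
with np.errstate(divide="ignore"):
    raw_log = np.log(probs)          # contains -inf

# Clipping to [1e-10, 1.0] keeps every value strictly positive,
# mirroring tf.clip_by_value(logits, 1e-10, 1.0) above.
clipped = np.clip(probs, 1e-10, 1.0)
safe_log = np.log(clipped)           # finite everywhere
print(np.isinf(raw_log).any())       # True
print(np.isfinite(safe_log).all())   # True
```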
Thanks for your help. The loss-explosion problem is solved after adding the clipping, but the loss still may not converge.
Try different learning rates and alpha/gamma values.
Try:
logits=tf.clip_by_value(logits,1e-7,1.0-1e-7)
or clipping may be required near the power function, in the line where the gamma term is computed.
What about p values? Are they still zero?
If they are still zero, then try a different function to calculate exp().
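The zeros are consistent with float32 underflow rather than a bug in any particular exp implementation. Assuming the focal-CTC relationship p = exp(-ctc_loss) (not shown in the thread), a quick NumPy check:

```python
import numpy as np

# Assumed focal-CTC relationship: p = exp(-ctc_loss).
ctc_loss = np.array([2.0, 50.0, 120.0], dtype=np.float32)
p = np.exp(-ctc_loss)

# float32 underflows to exactly 0 once ctc_loss exceeds roughly 103
# (the smallest subnormal is about 1.4e-45), so swapping tf.exp for
# tf.math.exp cannot help: both compute the same function.
print(p)  # last entry is exactly 0.0
```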
Most of them are zeros and some may be around 1e-37. I have tried tf.math.exp() instead of tf.exp; the result is the same and it does not converge.
The upper limit of clipping could be an issue.
Since the values in p are very small, the upper clipping limit might be causing the problem.
Try printing the values of p without clipping.
Then deduce which range of values would be good for clipping.
Once clipping puts the values of p in an appropriate range (not 0 or a very small number), it should converge.
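The suggested procedure (print p, deduce a range, then clip) can be sketched as follows; the p values here are hypothetical, and using the smallest nonzero observation as the floor is just one possible heuristic:

```python
import numpy as np

# Hypothetical batch of p values printed without clipping (assumed data).
p = np.array([0.0, 1e-37, 3e-4, 0.02, 0.4])

# Deduce a clipping range from the observed nonzero values: use the
# smallest nonzero p as the floor and 1.0 as the ceiling.
lower = p[p > 0].min()
clipped = np.clip(p, lower, 1.0)
print(clipped.min() > 0)  # True: no exact zeros remain
```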
In fact, the loss becomes NaN while the p values are still zeros even without clipping, so I cannot observe an appropriate range.
For now, I will train with the CTC loss alone until it converges and then fine-tune with the focal loss; I will post an update when I get results.
Also keep an eye on the ctc_loss output range.
Final Solution:
Clip ctc_loss() instead of the gradients.
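A minimal sketch of that final fix: clip the raw CTC loss before it enters the focal term, so that p = exp(-ctc_loss) never underflows to zero. The focal form -alpha * (1-p)**gamma * log(p), the parameter defaults, and the max_loss bound are all assumptions, since the thread does not show the exact implementation:

```python
import numpy as np

def focal_ctc_loss(ctc_loss, alpha=0.25, gamma=2.0, max_loss=20.0):
    """Focal-weighted CTC loss (assumed formulation) with the raw CTC
    loss clipped so p = exp(-ctc_loss) stays strictly positive."""
    clipped = np.clip(ctc_loss, 0.0, max_loss)  # clip the loss, not the gradients
    p = np.exp(-clipped)                        # p >= exp(-max_loss) > 0
    return -alpha * (1.0 - p) ** gamma * np.log(p)

raw = np.array([3.0, 500.0])     # second sample would underflow without clipping
loss = focal_ctc_loss(raw)
print(np.isfinite(loss).all())   # True: no NaN/inf after clipping
```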
I am closing this issue now!
@cjt222 Hi, does FocalCtcLoss improve your CRNN accuracy?