Warmup causes NaN
lyw615 opened this issue · 1 comment
lyw615 commented
I use RAdam in Mask R-CNN with a Keras implementation. After warmup completes, the loss becomes NaN. With SGD and no warmup, the loss value is normal. This only happens while training the heads layers. Usage is as follows (learning_rate is 0.001); any reply would be appreciated:
if warmup:
    optimizer_use = RAdam(learning_rate=1e-5, total_steps=all_steps,
                          warmup_proportion=0.05, min_lr=learning_rate)
else:
    optimizer_use = RAdam()
self.keras_model.compile(optimizer=optimizer_use,
                         loss=[None] * len(self.keras_model.outputs))
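For reference, a warmup schedule of this kind ramps the learning rate up during the warmup steps and then decays it toward min_lr for the remaining steps. Below is a minimal pure-Python sketch of such a schedule; the helper name and the linear ramp/decay shape are assumptions for illustration, not the library's exact implementation:

```python
def warmup_then_decay_lr(step, total_steps, warmup_proportion, peak_lr, min_lr):
    """Sketch of a warmup schedule: linear ramp from 0 up to peak_lr
    over the warmup steps, then linear decay toward min_lr."""
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Warmup phase: ramp linearly toward the peak learning rate.
        return peak_lr * (step + 1) / warmup_steps
    # Post-warmup phase: move linearly from peak_lr toward min_lr.
    decay_frac = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr - (peak_lr - min_lr) * decay_frac
```

Note that with the values in the snippet above (learning_rate=1e-5, min_lr=0.001), min_lr is two orders of magnitude larger than the peak rate, so under a schedule like this the post-warmup phase would *increase* the effective learning rate rather than decay it, which may be worth checking as a cause of the NaN.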
stale commented
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.