ucbdrive/skipnet

NaN encountered when training imagenet_rnn_gate_rl_50


I ran into a problem when training imagenet_rnn_gate_rl_50 starting from the provided pretrained sp model:

10-26-18 02:30:Epoch: [4][4010/5004] Time 1.793 (1.839) Data 0.000 (0.003) Loss nan (nan) Total rewards nan (nan) Prec@1 0.391 (75.964) Prec@5 1.953 (91.181)
10-26-18 02:30:total gate rewards = 2.560
10-26-18 02:30:*** Computation Percentage: 97.532 %
10-26-18 02:30:Epoch: [4][4020/5004] Time 1.754 (1.838) Data 0.000 (0.003) Loss nan (nan) Total rewards nan (nan) Prec@1 0.000 (75.775) Prec@5 0.000 (90.954)

I didn't change any of the default configuration, yet the loss and the rewards became NaN. Is there anything else I need to pay attention to? Please help.

Thanks,
Willy

Really sorry about the late reply. Not sure whether you have already solved the issue. Did you try reducing the learning rate? If this is still bothering you, I can take a look at the code and re-run the experiments to check.
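
If it helps narrow down where the NaN first shows up, here is a minimal sketch of a guard that records the first step with a non-finite loss and keeps it out of the running mean. Once a single NaN enters a running average, every later average prints as nan, which may be why the value in parentheses in the log stays at nan. This is not code from the repository: `FiniteLossGuard` is a hypothetical helper, and the loss values are made up for illustration. In a PyTorch loop you would presumably pass `loss.item()` and skip the optimizer step when `update` returns False.

```python
import math


class FiniteLossGuard:
    """Hypothetical helper: running mean of the loss that rejects NaN/inf."""

    def __init__(self):
        self.total = 0.0
        self.count = 0
        self.first_bad_step = None  # first step whose loss was not finite

    def update(self, step, loss):
        """Return True if the loss is finite and was added to the mean."""
        if not math.isfinite(loss):
            if self.first_bad_step is None:
                self.first_bad_step = step
            return False
        self.total += loss
        self.count += 1
        return True

    @property
    def mean(self):
        # Mean over the finite losses seen so far (0.0 if there are none).
        return self.total / self.count if self.count else 0.0


guard = FiniteLossGuard()
losses = [2.31, 2.27, float("nan"), 2.40]  # made-up values, not from the log
skipped_steps = []
for step, loss in enumerate(losses):
    if not guard.update(step, loss):
        # In a real training loop, skip the backward pass and the
        # optimizer step here, and log the batch for inspection.
        skipped_steps.append(step)
```

Knowing the first bad step makes it easier to tell whether the loss blows up gradually, which would point to the learning rate, or whether one particular batch triggers it.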

Thanks,
Xin