cost: -nan
tanya-ling opened this issue · 1 comment
I encounter `cost: -nan` after a few iterations (between 5 and 7) on my own dataset of 39 GB. The vectors are also all NaNs.
I have read this issue, but decreasing the learning rate doesn't solve my problem (I tried 0.05, 0.005, 0.0005, and 0.0001, all with the same result).
The cost decreases for a few iterations and then goes to NaN.
I have a really small vocabulary of fewer than 5,000 words (that's intentional; I pretokenized my corpus that way) and a large vector size (I tried 500 and 1000).
The co-occurrence matrix seems to be constructed fine and weighs in at about 396 MB. The vocab file also looks good.
Even the vectors produced after a few iterations (before -nan appears) look reasonable and do not completely fail on a lexical similarity task.
However, I would like to continue training, since I am not sure that the model has converged.
Please give some advice on how to avoid -nan in the cost and the vectors.
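For reference, this is roughly how I check whether a saved vectors.txt has gone bad — a quick sketch, not part of GloVe itself (the only assumption is GloVe's default output format of one word followed by its components per line), that counts non-finite values:

```c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Sketch: scan a GloVe vectors.txt and count non-finite entries.
   Not part of GloVe; "vectors.txt" is just the default output name. */
int main(int argc, char **argv) {
    const char *path = (argc > 1) ? argv[1] : "vectors.txt";
    FILE *f = fopen(path, "r");
    if (!f) { perror(path); return 1; }

    char tok[1024];
    long bad = 0, total = 0;
    while (fscanf(f, "%1023s", tok) == 1) {
        char *end;
        double v = strtod(tok, &end);
        if (end != tok && *end == '\0') {   /* token parsed fully as a number */
            total++;
            if (!isfinite(v)) bad++;        /* counts nan and inf */
        }
    }
    fclose(f);
    printf("%ld non-finite out of %ld numeric values\n", bad, total);
    return bad ? 2 : 0;
}
```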
I installed GloVe with

```
$ git clone http://github.com/stanfordnlp/glove
$ cd glove && make
```

if it matters.
I merged in a change that clips the gradients so they can't go infinite. Hopefully that resolves the NaNs.
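For anyone who lands here, the idea is just to cap the weighted error term before it feeds the parameter updates. Below is a minimal sketch of that technique in C, assuming a plain-SGD variant of GloVe's weighted least-squares update — the names `update_pair` and `GRAD_CLIP`, the threshold value, and the SGD step are illustrative, not the actual patch (which works on the AdaGrad updates in glove.c):

```c
#include <stdio.h>
#include <math.h>

/* Illustrative clip threshold -- an assumption, not the merged value. */
#define GRAD_CLIP 100.0

static double clip(double x) {
    if (x >  GRAD_CLIP) return  GRAD_CLIP;
    if (x < -GRAD_CLIP) return -GRAD_CLIP;
    return x;
}

/* One update for a word/context pair: diff is the prediction error,
   fdiff the weighted gradient term, clipped so a single extreme
   co-occurrence count cannot blow the parameters up to inf/nan. */
static void update_pair(double *w_i, double *w_j, double *b_i, double *b_j,
                        int dim, double xij, double eta) {
    double diff = *b_i + *b_j - log(xij);
    for (int d = 0; d < dim; d++) diff += w_i[d] * w_j[d];

    /* GloVe's weighting f(x) = (x/xmax)^alpha with xmax = 100, alpha = 0.75 */
    double weight = (xij < 100.0) ? pow(xij / 100.0, 0.75) : 1.0;
    double fdiff  = clip(weight * diff);   /* the clip keeps updates finite */

    for (int d = 0; d < dim; d++) {
        double tmp = w_i[d];               /* keep old value for w_j update */
        w_i[d] -= eta * fdiff * w_j[d];
        w_j[d] -= eta * fdiff * tmp;
    }
    *b_i -= eta * fdiff;
    *b_j -= eta * fdiff;
}

int main(void) {
    double wi[2] = {0.1, -0.2}, wj[2] = {0.3, 0.05};
    double bi = 0.0, bj = 0.0;
    update_pair(wi, wj, &bi, &bj, 2, 50.0, 0.05);
    printf("bi=%f bj=%f\n", bi, bj);
    return 0;
}
```

Clipping the single scalar fdiff is cheap and bounds every coordinate's step at once, since each per-dimension update is just fdiff times a stored value.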