NaN loss function when b approaches 0
gm-spacagna opened this issue · 2 comments
I have tried to solve the NaN loss problem, and I found this trick to be helpful: adding the epsilon constant to the argument of K.log:
loglikelihoods = u * \
    K.log(K.exp(hazard1 - hazard0) - 1.0 + epsilon) - hazard1
This way, when b ~ 0 and thus hazard1 ≈ hazard0, the logarithm is always defined.
Line 202 in c0075a7
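For context, a minimal self-contained sketch of the discrete Weibull log-likelihood with this guard in place. The hazard definitions (t / a)^b and ((t + 1) / a)^b follow the Weibull formulation used above; the function name and the default epsilon value are illustrative, not the library's exact code.

```python
from keras import backend as K

def loglik_discrete(y, u, a, b, epsilon=1e-6):
    # Discrete Weibull cumulative hazard (t / a)^b, evaluated at t and t + 1
    hazard0 = K.pow(y / a, b)
    hazard1 = K.pow((y + 1.0) / a, b)
    # epsilon keeps the log argument strictly positive when b ~ 0,
    # i.e. when hazard1 and hazard0 collapse to (almost) the same value
    return u * K.log(K.exp(hazard1 - hazard0) - 1.0 + epsilon) - hazard1
```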
I solved it in the same way as you did. One issue I found, however, was that the loss still went to NaN when training on GPUs (I suspect due to GPU float32 constraints, but I'm no expert here).
To make it run on the GPU, I replaced epsilon with 1e-6 instead:
exp = k.exp(hazard1 - hazard0) + 1e-6
log = k.log(exp - 1)
loglk = -1 * k.mean((u_ * log) - hazard1)
and it seemed to work.
Edit: when I was looking into this, I was observing the loss function output with two different functions that did the same operations, one using TensorFlow and the other NumPy. I tried tf.float32, tf.float64, np.float32 and np.float64 dtypes for the alpha and beta values, and I only ended up with NaNs with the tf.float32 option.
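A simplified illustration of the precision issue (not the original comparison scripts; plain NumPy with illustrative values for t, a and b, and exactly where float32 breaks down will depend on the backend and the values): for a small enough b the two hazards round to the same float32 number, so exp(hazard1 - hazard0) - 1 collapses to zero and the log blows up, while the 1e-6 guard keeps it finite.

```python
import numpy as np

def log_term(t, a, b, dtype, guard=0.0):
    # One log-likelihood term log(exp(hazard1 - hazard0) - 1 + guard) in the given dtype
    t, a, b, guard = (np.asarray(x, dtype=dtype) for x in (t, a, b, guard))
    hazard0 = (t / a) ** b
    hazard1 = ((t + 1.0) / a) ** b
    return np.log(np.exp(hazard1 - hazard0) - 1.0 + guard)

# With b ~ 1e-8 both hazards round to 1.0 in float32, but not in float64:
print(log_term(5.0, 2.0, 1e-8, np.float32))              # -inf (log of zero)
print(log_term(5.0, 2.0, 1e-8, np.float64))              # finite, ~ -20.1
print(log_term(5.0, 2.0, 1e-8, np.float32, guard=1e-6))  # finite, ~ -13.8
```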
Great @FedericoNutmeg and relevant to #51.
Currently (on develop + master) it looks like this:
Line 169 in 2661265
where epsilon is K.epsilon(), which I think defaults to whatever is in your ~/.keras/keras.json. I suggest changing it (this is something that should actually be warned about, but the current warning message is wrong; I haven't had time to test it yet). Try keras.backend.set_epsilon(1e-6) and it should behave as you suggested.
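A quick usage sketch of that suggestion (standard Keras backend calls; the 1e-6 value is the one proposed above): set the epsilon before building and compiling the model, so any loss code that reads K.epsilon() picks up the larger value.

```python
from keras import backend as K

K.set_epsilon(1e-6)   # bump the global fuzz factor (the default comes from keras.json, typically 1e-7)
print(K.epsilon())    # 1e-06

# build and compile the model after this point, so the loss sees the new epsilon
```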
In the numerical stability tests I might actually be using float64; this should be updated.
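If the goal is to make those tests exercise the precision that actually runs on the GPU, one option (a standard Keras backend call, not something currently in the tests) is to force the backend float type to float32 before running them:

```python
from keras import backend as K

K.set_floatx('float32')   # make the backend build tensors in float32, as on the GPU
print(K.floatx())         # 'float32'
```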