The loss becomes negative from positive values during the training loop
yijianSU22 opened this issue · 5 comments
Hi, I just ran a UNet model on a training set, using a combined Dice and cross-entropy loss as the loss function, but I found that the loss value is not normal: it gradually becomes negative. See below:
2024-04-27 22:54:02.697477: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2617/2617 [==============================] - 9485s 4s/step - loss: 0.3995 - accuracy: 0.0302 - val_loss: 0.3482 - val_accuracy: 0.0182
Epoch 2/20
2617/2617 [==============================] - 9453s 4s/step - loss: 0.1805 - accuracy: 0.2205 - val_loss: 0.1516 - val_accuracy: 0.9400
Epoch 3/20
2617/2617 [==============================] - 9428s 4s/step - loss: 0.0435 - accuracy: 0.9362 - val_loss: 0.1033 - val_accuracy: 0.9482
Epoch 4/20
2617/2617 [==============================] - 9412s 4s/step - loss: -0.0293 - accuracy: 0.9398 - val_loss: 0.0141 - val_accuracy: 0.9459
Epoch 5/20
2617/2617 [==============================] - 9444s 4s/step - loss: -0.0844 - accuracy: 0.9420 - val_loss: -0.0150 - val_accuracy: 0.9548
Epoch 6/20
2617/2617 [==============================] - 9436s 4s/step - loss: -0.1212 - accuracy: 0.9440 - val_loss: -0.0363 - val_accuracy: 0.9599
Epoch 7/20
2617/2617 [==============================] - 9397s 4s/step - loss: -0.1537 - accuracy: 0.9457 - val_loss: -0.0193 - val_accuracy: 0.9538
Epoch 8/20
2617/2617 [==============================] - 9305s 4s/step - loss: -0.1777 - accuracy: 0.9467 - val_loss: -0.0149 - val_accuracy: 0.9526
Epoch 9/20
2617/2617 [==============================] - 8968s 3s/step - loss: -0.2004 - accuracy: 0.9473 - val_loss: -0.0841 - val_accuracy: 0.9576
Epoch 10/20
2617/2617 [==============================] - 8787s 3s/step - loss: -0.2210 - accuracy: 0.9480 - val_loss: -0.0822 - val_accuracy: 0.9571
Epoch 11/20
2617/2617 [==============================] - 8794s 3s/step - loss: -0.2337 - accuracy: 0.9486 - val_loss: -0.0837 - val_accuracy: 0.9566
Epoch 12/20
2617/2617 [==============================] - 8809s 3s/step - loss: -0.2521 - accuracy: 0.9492 - val_loss: -0.0856 - val_accuracy: 0.9615
Epoch 13/20
2617/2617 [==============================] - 8804s 3s/step - loss: -0.2688 - accuracy: 0.9500 - val_loss: -0.1012 - val_accuracy: 0.9594
Epoch 14/20
2617/2617 [==============================] - 8807s 3s/step - loss: -0.2867 - accuracy: 0.9508 - val_loss: -0.0994 - val_accuracy: 0.9599
Epoch 15/20
2617/2617 [==============================] - 8721s 3s/step - loss: -0.2949 - accuracy: 0.9511 - val_loss: -0.1008 - val_accuracy: 0.9605
Epoch 16/20
2617/2617 [==============================] - 8684s 3s/step - loss: -0.3071 - accuracy: 0.9515 - val_loss: -0.0705 - val_accuracy: 0.9564
Epoch 17/20
349/2617 [===>..........................] - ETA: 37:27 - loss: -0.0398 - accuracy: 0.9501
And this is my loss function:
```python
class categorical_dicePcrossentropy_weight(tf.keras.losses.Loss):
    def __init__(self, class_weight, lamda=0.5):
        super().__init__()
        self.lamda = lamda
        self.weight = class_weight

    def call(self, y_true, y_pred):
        smooth = tf.constant(1.e-5, tf.float32)
        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.cast(y_pred, tf.float32)
        intersection = tf.math.reduce_sum(y_pred * y_true, axis=(1, 2, 3))
        union = tf.math.reduce_sum(y_pred + y_true, axis=(1, 2, 3))
        dice_coef = tf.math.reduce_sum(2 * (intersection + smooth) / (union + smooth), axis=0)
        loss1 = tf.math.reduce_mean(self.weight * dice_coef)
        epsilon = 1.e-5
        output = y_pred / tf.math.reduce_sum(y_pred, axis=-1, keepdims=True)
        output = tf.clip_by_value(output, epsilon, 1 - epsilon)
        loss = y_true * tf.math.log(output)
        loss = tf.math.reduce_mean(loss, axis=(1, 2, 3))
        loss = tf.math.reduce_mean(loss, axis=0)
        loss2 = tf.math.reduce_mean(self.weight * loss)
        total_loss = (1 - self.lamda) * (1 - loss1) + self.lamda * loss2
        return total_loss
```
I don't know why. Is there a way to resolve it?
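For context, the `loss2` term as written can only be zero or negative, since it is a mean of `y_true * log(output)` and the log of a probability never exceeds zero. A minimal NumPy stand-in for a single one-hot pixel (hypothetical values, mirroring the `tf` ops):

```python
import numpy as np

# NumPy stand-in for the loss2 term above, for one hypothetical
# one-hot pixel. log of a probability is always <= 0, so this term
# (with no leading minus) can only pull the total loss downward.
y_true = np.array([0.0, 1.0, 0.0])
y_pred = np.array([0.1, 0.8, 0.1])

output = y_pred / y_pred.sum()                 # renormalize predictions
output = np.clip(output, 1e-5, 1 - 1e-5)       # mirrors tf.clip_by_value
loss2 = np.mean(y_true * np.log(output))       # <= 0 as written
```

As the model gets more confident, `log(output)` at the true class approaches 0 from below, so this term dominates less but stays non-positive.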
- For small values, `tf.math.log(output)` is negative. `tf.clip_by_value()` does not work for `nan`: e.g. if `output` contains `nan`, then `tf.clip_by_value(output, epsilon, 1-epsilon)` also contains `nan`, if I'm not mistaken.
Thanks very much, yes, you're right. It should be `-y_true * tf.math.log(output)` here.
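With that sign flip, the cross-entropy term becomes non-negative. A minimal NumPy sketch of the corrected term (hypothetical helper name and values, standing in for the `tf` ops):

```python
import numpy as np

# Sketch of the corrected cross-entropy term with the leading minus
# sign; ce_term is a hypothetical name, not from the original code.
def ce_term(y_true, y_pred, epsilon=1e-5):
    output = y_pred / y_pred.sum(axis=-1, keepdims=True)
    output = np.clip(output, epsilon, 1 - epsilon)
    return np.mean(-y_true * np.log(output))  # note the leading minus

# Two hypothetical one-hot pixels: the term is now always >= 0.
y_true = np.array([[0.0, 1.0], [1.0, 0.0]])
y_pred = np.array([[0.2, 0.8], [0.9, 0.1]])
ce = ce_term(y_true, y_pred)
```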
Hi, sorry to bother you again. I don't know why, but even when I use `tf.keras.losses.CategoricalCrossentropy()` to compute the CE term, the loss value still becomes negative during the training loop.
Hi @yijianSU22,
The op `tf.math.log(x)` outputs `-inf` if the value of `x` is 0, and `nan` if `x < 0`. You can clip `-inf` values to a value you want using `tf.clip_by_value`, but for `nan`, `clip_by_value` also returns `nan`. Since this is a custom loss function, maybe you need to recheck it.
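The behavior described above can be reproduced with the NumPy equivalents of these ops (a minimal sketch; the TF ops follow the same IEEE float rules):

```python
import numpy as np

# log(0) gives -inf, log of a negative value gives nan, and clipping
# does not repair nan: it propagates straight through the clip.
with np.errstate(divide="ignore", invalid="ignore"):
    print(np.log(0.0))                       # -inf
    print(np.log(-1.0))                      # nan
    print(np.clip(np.log(-1.0), 1e-5, 1.0))  # still nan
```

So clipping must happen *before* the log (as the original code does), and any `nan` already present in `y_pred` has to be traced back to its source rather than clipped away.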