Humongous mean loss value problem during evaluation
gabes21 opened this issue · 2 comments
Hello @mathieuorhan, I managed to use your code to train my model, but during training I noticed an extremely large mean loss value in every evaluation result. What could be causing that, and do you have any idea how to solve this issue? When I use that model to make predictions, it predicts the same label for every point. I use xyz and rgb, where the rgb values are originally 16-bit integers, but I have already converted them to the 0-255 range.
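For context, rescaling 16-bit color channels to the 0-255 range is just a linear scaling; the snippet below is only a minimal NumPy sketch with illustrative variable names, not the actual preprocessing code used here:

```python
import numpy as np

# Illustrative only: `colors_16bit` stands in for the raw 16-bit RGB values
colors_16bit = np.array([[65535, 32768, 0]], dtype=np.uint16)

# Linear rescale from [0, 65535] down to [0, 255]
colors_8bit = (colors_16bit.astype(np.float32) / 65535.0 * 255.0).astype(np.uint8)
print(colors_8bit)  # [[255 127   0]]
```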
---- EPOCH 190 EVALUATION ----
Progress: [----------] 0.0% checkinglossval
2.7039943e+20
Progress: [#####-----] 50.0% checkinglossval
2.4945753e+20
Progress: [##########] 100% Done...
mean loss: 259928480659827326976.000000
hulahulahulahoop
49152.0
Overall accuracy : 0.309652
Average IoU : 0.077413
IoU of rumput : 0.000000
IoU of segment : 0.000000
IoU of atap : 0.309652
hulahulahulahoop
49152.0
Model saved in file: logs/semantic/model.ckpt
**** EPOCH 191 ****
2023-04-03 05:09:51.769989
Progress: [##########] 100% Done...
mean loss: 0.046747
hulahulahulahoop
172032.0
Overall accuracy : 0.994327
Average IoU : 0.741325
IoU of rumput : 0.991025
IoU of segment : 0.982337
IoU of atap : 0.991939
**** EPOCH 192 ****
2023-04-03 05:09:53.309120
Progress: [##########] 100% Done...
mean loss: 0.043574
hulahulahulahoop
172032.0
Overall accuracy : 0.994629
Average IoU : 0.742074
IoU of rumput : 0.992080
IoU of segment : 0.985111
IoU of atap : 0.991106
**** EPOCH 193 ****
2023-04-03 05:09:54.814600
Progress: [##########] 100% Done...
mean loss: 0.048196
hulahulahulahoop
172032.0
Overall accuracy : 0.993937
Average IoU : 0.740546
IoU of rumput : 0.990235
IoU of segment : 0.979310
IoU of atap : 0.992638
**** EPOCH 194 ****
2023-04-03 05:09:56.928389
Progress: [##########] 100% Done...
mean loss: 0.028117
hulahulahulahoop
172032.0
Overall accuracy : 0.996692
Average IoU : 0.745103
IoU of rumput : 0.994958
IoU of segment : 0.990916
IoU of atap : 0.994536
**** EPOCH 195 ****
2023-04-03 05:09:59.347019
Progress: [##########] 100% Done...
mean loss: 0.029439
hulahulahulahoop
172032.0
Overall accuracy : 0.996966
Average IoU : 0.745522
IoU of rumput : 0.997603
IoU of segment : 0.991430
IoU of atap : 0.993053
2023-04-03 05:10:01.641992
---- EPOCH 195 EVALUATION ----
Progress: [----------] 0.0% checkinglossval
2.5435268e+20
Progress: [#####-----] 50.0% checkinglossval
2.634496e+20
Progress: [##########] 100% Done...
mean loss: 258901140975298543616.000000
hulahulahulahoop
49152.0
Overall accuracy : 0.345317
Average IoU : 0.086329
IoU of rumput : 0.000000
IoU of segment : 0.000000
IoU of atap : 0.345317
hulahulahulahoop
49152.0
Hello @mathieuorhan, am I supposed to use an activation function at the end of the last layer of net in the get_model function, or should I leave the activation function as None?
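For reference, the usual TF 1.x pattern for segmentation networks is to leave the last layer without an activation and let the softmax cross-entropy loss apply the softmax internally; the sketch below is illustrative only and does not reproduce this repository's get_model or tf_util helpers:

```python
import tensorflow as tf

def segmentation_head(point_features, labels, num_classes):
    """Illustrative final layer + loss for a per-point classifier."""
    # Raw logits: no activation on the last layer
    logits = tf.layers.dense(point_features, num_classes, activation=None)

    # Softmax is applied inside the loss, so adding an activation here
    # would effectively apply it twice
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))

    # Predictions come from the argmax of the raw logits
    preds = tf.argmax(logits, axis=-1)
    return logits, loss, preds
```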
My mean loss value has become normal now. The problem was caused by the contrib code that I had initially modified in tf_util.py. After changing this:
initializer = tf.contrib.layers.xavier_initializer()
return tf.contrib.layers.batch_norm(inputs, center=True, scale=True,
                                    is_training=is_training, decay=bn_decay,
                                    updates_collections=None, scope=scope,
                                    data_format=data_format)
into this:
initializer = tf.keras.initializers.glorot_uniform()
return tf.layers.batch_normalization(inputs, center=True, scale=True,
                                     training=is_training, momentum=bn_decay)
my mean loss stopped blowing up to those huge values.
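One general TF 1.x caveat about this replacement (an assumption about the surrounding training code, not something confirmed in this repo): tf.contrib.layers.batch_norm with updates_collections=None updates the moving mean/variance in place, while tf.layers.batch_normalization puts those updates into tf.GraphKeys.UPDATE_OPS, which the training op has to run explicitly, roughly like this:

```python
import tensorflow as tf

# Minimal sketch: wire the batch-norm update ops into the train step.
# `x`, `net`, and `loss` are dummies standing in for the real network.
x = tf.random_normal([8, 4])
net = tf.layers.dense(x, 3)
net = tf.layers.batch_normalization(net, training=True)
loss = tf.reduce_mean(tf.square(net))

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # The moving mean/variance get refreshed on every training step;
    # without this dependency, evaluation (training=False) would run
    # with the never-updated initial statistics.
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```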