
Training loss

VincentChong123 opened this issue · 2 comments

Hi @frotms, @mttbx, @MenSanYan,

Based on your code, I added the training code below, but I cannot reduce the training loss below about 6.9.

Do you have sample training code and its training loss? Or do you see any errors in my training code?


if __name__ == "__main__":
    print("begin ...")
    input_test = tf.zeros([2, 224, 224, 3])
    num_classes = 1000

    if 0:  # disabled: build the model on a dummy all-zero input
        model, end_points = mobilenet_v3_small(input_test, num_classes, multiplier=1.0, is_training=True, reuse=None)

    t_steps = 1000
    t_batch = 128
    # Random images and random integer labels in [0, 1000)
    input_rand = tf.random.uniform(shape=(t_batch, 224, 224, 3), minval=0, maxval=1)
    x_batch = input_rand
    y_batch = tf.random.uniform(shape=(t_batch,), minval=0, maxval=1000, dtype=tf.int32)

    logits, end_points = mobilenet_v3_small(x_batch, num_classes, multiplier=1.0, is_training=True, reuse=None)

    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y_batch))
    #train_ops = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(loss)
    train_ops = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

    sess = tf.Session()
    # Variables must be initialized before the first training step
    sess.run(tf.global_variables_initializer())
    for s in range(t_steps):
        _, loss_batch = sess.run([train_ops, loss])
        print("steps {:05d} loss {:03f}".format(s, loss_batch))

    print("done !")

steps 00000 loss 6.914634
steps 00001 loss 6.907555
steps 00002 loss 6.905149
steps 00003 loss 6.905774
steps 00004 loss 6.904990
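
For reference, a loss of about 6.9 with 1000 classes is what a classifier produces when it assigns equal probability to every class, since ln(1000) ≈ 6.9078. The short check below uses only the standard library; the helper name `softmax_cross_entropy` and the label value 123 are made up for illustration and are not part of the repository.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, given raw logits and an integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

num_classes = 1000
uniform_logits = [0.0] * num_classes  # every class equally likely
chance_loss = softmax_cross_entropy(uniform_logits, label=123)  # label is arbitrary
print("chance-level loss: {:.6f}".format(chance_loss))  # prints 6.907755
```

A logged loss that stays near this value suggests the network is still predicting at chance level.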

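@weishengchong Your training op never runs the batch normalization update ops, so the moving statistics of those layers are not updated. Make the training op depend on the ops in tf.GraphKeys.UPDATE_OPS: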

with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
    train_ops = optimizer.minimize(loss)
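
As a rough illustration of what those update ops do, here is a minimal sketch that assumes the batch normalization layers keep an exponential moving average of the batch statistics, as `tf.layers.batch_normalization` does. Whether this repository uses that exact layer is an assumption; the function name, the momentum default, and the batch means below are made up for the example.

```python
def update_moving_average(moving, batch_value, momentum=0.99):
    """One exponential-moving-average step, the kind of update a BN update op performs."""
    return momentum * moving + (1.0 - momentum) * batch_value

# Hypothetical per-step batch means of one channel
batch_means = [0.52, 0.48, 0.50, 0.51]

# If the update ops are never run, the moving mean keeps its initial value.
moving_mean_without_updates = 0.0

# If they run on every training step, it drifts toward the batch statistics.
moving_mean_with_updates = 0.0
for batch_mean in batch_means:
    moving_mean_with_updates = update_moving_average(moving_mean_with_updates, batch_mean)

print("without update ops:", moving_mean_without_updates)
print("with update ops:   ", moving_mean_with_updates)
```

The moving statistics are what the layer uses when is_training=False, so skipping the update ops mainly shows up at evaluation time.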

Hi @frotms,
Thanks for your advice, it works.

I also applied the correction below so that the input is a constant random batch. With tf.random.uniform, new values are drawn on every training step.

x_batch = tf.constant(np.random.uniform(low=0, high=1.0, size=(t_batch, 224, 224, 3)).astype(np.float32))
y_batch = tf.constant(np.random.uniform(low=0, high=num_classes, size=(t_batch,)).astype(np.int32))
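
To show why a fixed batch matters, here is a small stand-in experiment. It is not MobileNetV3 but a tiny logistic regression written with the standard library only, and every name and hyperparameter in it is made up for illustration. It compares training on one fixed random batch with training on a batch that is redrawn at every step, which is how a random op behaves in TF1 graph mode.

```python
import math
import random

def make_batch(rng, n=8, dim=3):
    """Draw n random feature vectors in [0, 1) with random binary labels."""
    xs = [[rng.random() for _ in range(dim)] + [1.0] for _ in range(n)]  # last entry is a bias input
    ys = [rng.randrange(2) for _ in range(n)]
    return xs, ys

def loss_and_grad(w, xs, ys):
    """Mean logistic loss and its gradient with respect to the weights w."""
    loss, grad = 0.0, [0.0] * len(w)
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        loss -= math.log(p) if y == 1 else math.log(1.0 - p)
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi
    n = len(xs)
    return loss / n, [g / n for g in grad]

def train(resample, steps=500, lr=0.5, seed=0):
    """Run gradient descent; if resample is True, draw a new random batch every step."""
    rng = random.Random(seed)
    xs, ys = make_batch(rng)
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        if resample:
            xs, ys = make_batch(rng)  # inputs and labels change on every step
        _, grad = loss_and_grad(w, xs, ys)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    final_loss, _ = loss_and_grad(w, xs, ys)
    return final_loss

print("chance level        :", math.log(2))
print("fixed random batch  :", train(resample=False))
print("resampled each step :", train(resample=True))
```

With a fixed batch the model can memorize the random labels, so the loss falls below the chance level of ln(2). With a batch redrawn at every step there is no stable pattern to fit, so the loss is expected to stay near chance.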