frotms/MobilenetV3-Tensorflow

Training loss

VincentChong123 opened this issue · 2 comments

Hi @frotms, @mttbx, @MenSanYan,

Based on your mobilenet_v3.py, I added the training code below, but I cannot get the training loss below ~6.9, which is roughly ln(1000), i.e. chance level for 1000 classes.

Do you have sample training code and its training loss? Or do you see any errors in my training code?

Thanks.
tf_mobilenetv3.zip

import tensorflow as tf

from mobilenet_v3 import mobilenet_v3_small  # mobilenet_v3.py from this repo

if __name__ == "__main__":
    print("begin ...")
    num_classes = 1000

    if 0:
        # Quick forward-pass smoke test.
        input_test = tf.zeros([2, 224, 224, 3])
        model, end_points = mobilenet_v3_small(input_test, num_classes, multiplier=1.0, is_training=True, reuse=None)
    else:
        t_steps = 1000
        t_batch = 128
        tf.random.set_random_seed(1)
        # Random images in [0, 1) and random labels in [0, num_classes).
        x_batch = tf.random.uniform(shape=(t_batch, 224, 224, 3), minval=0, maxval=1)
        y_batch = tf.random.uniform(shape=(t_batch,), minval=0, maxval=num_classes, dtype=tf.int32)

        logits, end_points = mobilenet_v3_small(x_batch, num_classes, multiplier=1.0, is_training=True, reuse=None)

        loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y_batch))
        # train_ops = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(loss)
        train_ops = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

        sess = tf.Session()
        sess.run(tf.global_variables_initializer())
        for s in range(t_steps):
            _, loss_batch = sess.run([train_ops, loss])
            print("steps {:05d} loss {:.6f}".format(s, loss_batch))

    print("done !")

steps 00000 loss 6.914634
steps 00001 loss 6.907555
steps 00002 loss 6.905149
steps 00003 loss 6.905774
steps 00004 loss 6.904990

You are losing the batch normalization layers' update ops, so their moving mean/variance are never updated. @weishengchong

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
    train_ops = optimizer.minimize(loss)
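
For context on why this is needed: the BN moving statistics are updated by assign ops that TensorFlow collects in UPDATE_OPS rather than wiring into the loss graph, so minimize(loss) alone never runs them. A minimal sketch, not from this repo, assuming the BN layers are built with tf.layers.batch_normalization (which is what populates UPDATE_OPS):

import tensorflow as tf

x = tf.random.uniform([4, 8])
# In training mode, BN normalizes with batch statistics and queues its
# moving-average updates in the UPDATE_OPS collection.
y = tf.layers.batch_normalization(x, training=True)

print(tf.get_collection(tf.GraphKeys.UPDATE_OPS))
# Expect AssignMovingAvg-style ops for the BN layer; without the control
# dependency above, sess.run(train_ops) never executes them.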

Hi @frotms ,
Thanks for your advice, it works.

I also applied the correction below to feed constant random tensors, rather than tf.random.uniform, which re-samples new values on every training step.

x_batch = tf.constant(np.random.uniform(low=0, high=1.0, size=(t_batch, 224, 224, 3)).astype(np.float32))
y_batch = tf.constant(np.random.uniform(size=(t_batch,), low=0, high=num_classes).astype(np.int32))
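
To see the difference, here is a tiny standalone sketch (TF1 semantics; the tensors are illustrative, not from the training code): a tf.random.uniform tensor is re-drawn on every sess.run, while a tf.constant built from NumPy is sampled once at graph-construction time, so the labels stay fixed across steps.

import numpy as np
import tensorflow as tf

rand_t = tf.random.uniform(shape=(2,))                                  # re-drawn on every run
const_t = tf.constant(np.random.uniform(size=(2,)).astype(np.float32))  # drawn once, at build time

sess = tf.Session()
print(sess.run(rand_t), sess.run(rand_t))    # two different vectors
print(sess.run(const_t), sess.run(const_t))  # the same vector twice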