yunjey/show-attend-and-tell

the regularization term in the loss

congve1 opened this issue · 1 comments

When I look into the code for the regularization term in the loss, I'm wondering whether it should use the mask as well.
In the code we get `alphas = tf.transpose(tf.stack(alpha_list), (1, 0, 2))  # (N, T, L)`, but caption lengths vary. So maybe we should multiply by the masks to make sure the `<NULL>` tokens don't contribute to the loss.
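To illustrate what I mean, here is a minimal NumPy sketch of the doubly stochastic attention regularization with padded time steps masked out. The function name `masked_alpha_reg` and the per-caption target `T_i / L` are my own (the repo uses a fixed `16./196` target); this is just a sketch of the idea, not the repo's actual code:

```python
import numpy as np

def masked_alpha_reg(alphas, mask, alpha_c=1.0):
    """Doubly stochastic attention regularization, ignoring <NULL> padding.

    alphas: (N, T, L) attention weights over L locations at each of T steps
    mask:   (N, T) array with 1.0 at real caption tokens, 0.0 at <NULL> padding
    """
    # Zero out attention produced at padded time steps so <NULL>
    # makes no contribution to the regularization term.
    masked = alphas * mask[:, :, None]            # (N, T, L)
    # Total attention each location received over the unpadded steps.
    attn_per_loc = masked.sum(axis=1)             # (N, L)
    # With T_i real tokens spreading attention over L locations, each
    # location "should" receive about T_i / L in total (my assumption,
    # replacing the fixed 16/196 target used for full-length captions).
    lengths = mask.sum(axis=1, keepdims=True)     # (N, 1)
    L = alphas.shape[2]
    return alpha_c * np.sum((lengths / L - attn_per_loc) ** 2)
```

With this, a caption padded with `<NULL>` after one real token is penalized the same regardless of what the (ignored) padded-step alphas look like.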

I've found the same question among the closed issues.