yunjey/show-attend-and-tell

the regularization term in the loss

congve1 opened this issue · 1 comments

When I look into the code for the regularization term in the loss, I'm wondering whether it should use the mask as well.
In the code we get `alphas = tf.transpose(tf.stack(alpha_list), (1, 0, 2))  # (N, T, L)`, but caption lengths vary. So maybe we should multiply by the masks to make sure the `<NULL>` tokens don't contribute to the loss.
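To illustrate what I mean, here is a minimal NumPy sketch of the doubly stochastic attention regularization with padded time steps masked out. The function name `masked_alpha_reg` and the per-caption target `T_i / L` are my own (the repo uses a fixed `16./196` target); this is just a sketch of the idea, not the repo's actual code:

```python
import numpy as np

def masked_alpha_reg(alphas, mask, alpha_c=1.0):
    """Doubly stochastic attention regularization, ignoring <NULL> padding.

    alphas: (N, T, L) attention weights over L locations at each of T steps
    mask:   (N, T) array with 1.0 at real caption tokens, 0.0 at <NULL> padding
    """
    # Zero out attention produced at padded time steps so <NULL>
    # makes no contribution to the regularization term.
    masked = alphas * mask[:, :, None]            # (N, T, L)
    # Total attention each location received over the unpadded steps.
    attn_per_loc = masked.sum(axis=1)             # (N, L)
    # With T_i real tokens spreading attention over L locations, each
    # location "should" receive about T_i / L in total (my assumption,
    # replacing the fixed 16/196 target used for full-length captions).
    lengths = mask.sum(axis=1, keepdims=True)     # (N, 1)
    L = alphas.shape[2]
    return alpha_c * np.sum((lengths / L - attn_per_loc) ** 2)
```

With this, a caption padded with `<NULL>` after one real token is penalized the same regardless of what the (ignored) padded-step alphas look like.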

I've found the same question among the closed issues.