google/next-prediction

exp mask

Closed this issue · 1 comments

The 'exp_mask' function defined in 'models.py' masks the unused features and returns the resulting vector as the input to the softmax function.

Suppose val is 1 and mask is 0 or 1. How can the result then distinguish between
$(1+0) \cdot (-1\mathrm{e}30)$ and $(1+1) \cdot (-1\mathrm{e}30)$?

Looking forward to your answer!

It doesn't actually matter what val is, and the multiplication happens before tf.add. Consider this case: val is a vector of size 3, [2.4, -5.0, 0.23], and mask is [1, 1, 0] (we use 0 to mark unused positions). The result is val + [0, 0, -1e30], so the last element of val becomes a very negative number no matter what its original value is (unless that value is a very large positive number, which is unlikely). After softmax, that position receives a weight of effectively zero.
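
To make the order of operations concrete, here is a minimal NumPy sketch of the same idea. It is not the code from 'models.py': the `(1 - mask) * -1e30` form is an assumption inferred from the example above (mask [1, 1, 0] turning into [0, 0, -1e30]), and the `softmax` helper is written here only for illustration.

```python
import numpy as np

def exp_mask(val, mask):
    # Assumed form: the mask term is computed first, then added to val.
    # mask == 1 -> adds 0 (value kept); mask == 0 -> adds -1e30 (value suppressed).
    val = np.asarray(val, dtype=np.float32)
    mask = np.asarray(mask, dtype=np.float32)
    return val + (1.0 - mask) * np.float32(-1e30)

def softmax(x):
    # Illustrative, numerically stable softmax over a 1-D vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

val = np.array([2.4, -5.0, 0.23], dtype=np.float32)
mask = np.array([1, 1, 0])  # 0 marks the unused position

masked = exp_mask(val, mask)  # first two entries unchanged, last one about -1e30
probs = softmax(masked)       # the masked position gets zero probability

print(masked)
print(probs)
```

Because the mask term is added rather than multiplied into val, the kept positions pass through unchanged, and only the masked position is pushed far enough down that its exponential underflows to zero.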