fix 0 logits in input module
Closed this issue · 1 comments
Currently, because the nil embedding is 0 (which is fine) and we pad memories to a specified memory size, we tend to have a bunch of memories which are empty [0 0 ... 0]. The problem with this is that we feed them into a softmax as-is, and exp(0) = 1, so on the output the empty memories each receive a uniform probability. This is problematic because it skews the probabilities of the non-empty memories.
So the solution is to add a largish negative number to the empty memories before the softmax is applied. The exp() of the value will then be 0, or close enough.
This issue is particularly evident in task 4, where each story consists of 2 sentences. If we make the memory size large, say 50 (only 2 are needed), two things tend to occur:
- We converge at a much slower rate
- We get a worse error rate
An alternative solution would be to make the batch size 1 (at least at a low level; a higher-level API can make this nicer). That way the memory can be of any size, since nothing in the underlying algorithm relies on the memory being a fixed size (at least I think this is the case; I have to double-check!).
Fixed. Went with the largish negative number idea. It makes a big difference.