Make the scores more tolerant for subword unit parts

Question

M4t1ss opened this issue 7 years ago · 0 comments

Tokens ending with @@ (non-final subword units) should be penalized less when their attention is aligned to multiple other tokens.
...or maybe merge the attention matrix to word level before scoring?
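
A minimal sketch of the second idea, in case it helps the discussion. It is not taken from the existing code: the function names `word_groups` and `merge_to_word_level` are hypothetical, and it assumes the attention matrix is a list of rows where `attn[t][s]` is the weight that target token `t` puts on source token `s`. Weights are summed over the subwords of a source word (so each row keeps its total mass) and averaged over the subwords of a target word. Other choices, such as taking the maximum, would be just as plausible.

```python
def word_groups(tokens, marker="@@"):
    """Group token indices into words.

    A token ending in `marker` is assumed to continue into the next token.
    """
    groups, current = [], []
    for i, tok in enumerate(tokens):
        current.append(i)
        if not tok.endswith(marker):
            groups.append(current)
            current = []
    if current:  # last token still carried a dangling marker
        groups.append(current)
    return groups


def merge_to_word_level(attn, src_tokens, tgt_tokens, marker="@@"):
    """Collapse a subword-level attention matrix to word level.

    Assumes attn[t][s] is the weight of target token t on source token s.
    """
    src_groups = word_groups(src_tokens, marker)
    tgt_groups = word_groups(tgt_tokens, marker)
    merged = []
    for tg in tgt_groups:
        row = []
        for sg in src_groups:
            # Sum over source subwords, average over target subwords.
            total = sum(attn[t][s] for t in tg for s in sg)
            row.append(total / len(tg))
        merged.append(row)
    return merged


# Made-up example: 3 target subwords (2 words) over 4 source subwords (2 words).
src = ["un@@", "believ@@", "able", "story"]
tgt = ["ne@@", "ticams", "stāsts"]
attn = [
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.3, 0.4, 0.1],
    [0.0, 0.1, 0.1, 0.8],
]
word_attn = merge_to_word_level(attn, src, tgt)
for row in word_attn:
    print([round(w, 3) for w in row])
```

After merging, the scores would be computed on `word_attn` instead of the subword matrix, so a word split into several @@ pieces is no longer penalized for attention that only moves between its own pieces.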