lizekang/ITDD

order of attention in deliberation decoder

Closed this issue · 0 comments

Was there any specific reason why the second decoder applies attention over the knowledge base first, and only then over the output of the first decoder?
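To make the question concrete, here is a minimal sketch of the two orderings I am comparing. It is not the ITDD code: `attend`, `second_pass_step`, and the `knowledge_first` flag are hypothetical names, attention is single-head with no learned projections, and layer norm and the feed-forward sublayer are left out. It only shows that swapping the two cross-attention sublayers changes the result, since the second sublayer queries with a state already updated by the first.

```python
import math


def attend(query, memory):
    """Single-head scaled dot-product attention of one query vector over memory rows."""
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, row)) / scale for row in memory]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(memory[0])
    return [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(dim)]


def second_pass_step(state, knowledge, first_pass, knowledge_first=True):
    """Apply the two cross-attention sublayers in the chosen order (hypothetical helper)."""
    order = [knowledge, first_pass] if knowledge_first else [first_pass, knowledge]
    for memory in order:
        context = attend(state, memory)
        # Residual add only; layer norm is omitted in this sketch.
        state = [s + c for s, c in zip(state, context)]
    return state


# Toy inputs, made up for illustration.
state = [1.0, 0.0]
knowledge = [[1.0, 0.0], [0.0, 1.0]]
first_pass = [[0.5, 0.5], [2.0, -1.0]]

knowledge_then_draft = second_pass_step(state, knowledge, first_pass, knowledge_first=True)
draft_then_knowledge = second_pass_step(state, knowledge, first_pass, knowledge_first=False)
print(knowledge_then_draft, draft_then_knowledge)
```

With `knowledge_first=True`, the query that reads the first decoder's output already carries knowledge context; with `knowledge_first=False` it is the other way around. I would like to know whether the first order was chosen deliberately or tested against the second.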