gicheonkang/dan-visdial

Do you consider the mask problem for attention?

Closed this issue · 1 comment

Do you consider the mask problem for attention?

Thank you for your interest in our work.
First, it is not clear whether the attention you mention is in the REFER module or the FIND module.
In the REFER module, we don't need to consider the mask because the REFER module simply computes multi-head attention between the question and the dialog history.
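To illustrate the point, here is a minimal sketch (not the repo's actual code; the shapes and variable names `question` and `history` are hypothetical) of attention between a question query and dialog-history features where every history round is a valid key, so no mask is required:

```python
import torch
import torch.nn as nn

num_heads, hidden = 8, 512
mha = nn.MultiheadAttention(embed_dim=hidden, num_heads=num_heads, batch_first=True)

question = torch.randn(32, 1, hidden)   # one query vector per question (hypothetical shapes)
history = torch.randn(32, 10, hidden)   # 10 rounds of dialog-history embeddings

# No attn_mask / key_padding_mask: all history rounds can legitimately be attended to
attended, weights = mha(query=question, key=history, value=history)
```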
In the FIND module, we also didn't employ the mask in our published work. However, we have newly implemented a visual mask for researchers who will reuse this code. You can check the details in the encoders/modules.py file.
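For reference, a minimal sketch of what a visual mask typically does (this is an assumption about the mechanism, not the code in encoders/modules.py; `region_mask` and the shapes are hypothetical): padded image-region slots get their attention logits set to -inf before the softmax so they receive zero weight.

```python
import torch
import torch.nn.functional as F

batch, num_regions, hidden = 32, 36, 512
query = torch.randn(batch, hidden)                # fused question/history query (hypothetical)
visual = torch.randn(batch, num_regions, hidden)  # image-region features
region_mask = torch.ones(batch, num_regions, dtype=torch.bool)
region_mask[:, 30:] = False                       # pretend the last regions are padding

logits = torch.bmm(visual, query.unsqueeze(2)).squeeze(2)  # (batch, num_regions)
logits = logits.masked_fill(~region_mask, float('-inf'))   # mask out padded regions
attn = F.softmax(logits, dim=1)                            # padded regions get weight 0
attended_visual = torch.bmm(attn.unsqueeze(1), visual).squeeze(1)
```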