akshitac8/BiAM

Why multi-headed self-attention?

hugh920 opened this issue · 1 comment

Dear author, thanks for your paper and code. However, I've had a question for a long time: why is multi-headed attention used in the RCB? Could we avoid multi-head attention and just use the normal (single-head) self-attention mechanism in the RCB instead? Looking forward to your reply. Thank you so much.

Hello @hugh920, thank you for your interest in our work. We wanted to use an attention module that helps us capture effective region-based features. We have also added results with other attention blocks in the paper to show that our proposed module works best for our setting.
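
As a side note for anyone comparing the two variants: this is not the RCB code from this repository, just a minimal PyTorch sketch showing that multi-head self-attention with `num_heads=1` reduces to ordinary single-head self-attention, so the comparison the question asks about can be run by changing a single argument. The dimensions and head count below are hypothetical.

```python
# Minimal sketch (not the actual RCB implementation): comparing multi-head
# self-attention with a single-head "normal" self-attention over region features.
import torch
import torch.nn as nn

embed_dim = 512                            # hypothetical region-feature dimension
regions = torch.randn(49, 2, embed_dim)    # (num_regions, batch, dim), e.g. a 7x7 feature map

multi_head = nn.MultiheadAttention(embed_dim, num_heads=8)   # multi-head variant
single_head = nn.MultiheadAttention(embed_dim, num_heads=1)  # single-head ("normal") variant

# Self-attention: queries, keys, and values are all the same region features.
out_multi, _ = multi_head(regions, regions, regions)
out_single, _ = single_head(regions, regions, regions)

print(out_multi.shape, out_single.shape)   # both (49, 2, 512)
```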