AIVResearch/MSANet

About the attention module

1173206772 opened this issue · 4 comments

Thanks for your great work.
In your attention module, you obtain the attention vector Va and then simply take a Hadamard product with the feature map to get the attention map Ma, right?
I don't understand the difference between the Va obtained this way and the support prototype vector. In other words, can I simply use the prototype vector instead of your attention vector Va and take its Hadamard product with the feature map F^23 to get the attention map?
Can you explain it to me?

Best wishes.

I appreciate your interest in our work.
We include an attention module to instruct the model to focus more on class-relevant information. To obtain the attention feature (class-relevant only), we first compute Va (the masked support feature passed through a Conv layer, giving a same-resolution representation of only the class-relevant information). We then map Va onto F^23 to get the final attention feature. For the prototype vector, the masked support feature is squeezed to one dimension using the MAP operation. I hope that answers your question.
You can replace Va with the support prototype vector (first bring it to the same resolution) to compute the attention feature and check the performance difference.
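For concreteness, here is a minimal sketch of the two options discussed above (not the repository's actual code; the tensor names, shapes, and the masking step are assumptions): the pooled attention vector Va versus a masked-average-pooling (MAP) prototype, each combined with F^23 by a Hadamard product.

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 256, 32, 32                      # hypothetical feature shape
Fs  = torch.randn(B, C, H, W)                    # support feature map
Ys  = torch.randint(0, 2, (B, 1, H, W)).float()  # binary support mask (assumed same resolution as Fs)
F23 = torch.randn(B, C, H, W)                    # query-side feature map (F^23)

masked_support = Fs * Ys                         # keep only class-relevant activations

# Attention path: squeeze the masked support feature to a [B, C, 1, 1] vector Va,
# then broadcast it over F^23 with a Hadamard product to get the attention feature.
Va = F.adaptive_avg_pool2d(masked_support, output_size=(1, 1))
Ma = Va * F23                                    # [B, C, H, W]

# Prototype path (MAP): average only over foreground pixels, then broadcast the same way.
proto = masked_support.sum(dim=(2, 3)) / Ys.sum(dim=(2, 3)).clamp(min=1.0)  # [B, C]
Ma_proto = proto.view(B, C, 1, 1) * F23          # [B, C, H, W]
```

If the masking step simply zeroes out background positions, the two vectors differ mainly in normalization: the pooled Va divides by all H×W positions, while the MAP prototype divides only by the number of foreground pixels.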

Sorry, I didn't notice your reply before.
You said Va is the masked support feature passed through a Conv layer to get a same-resolution representation of only the class-relevant information, but from the code att = F.adaptive_avg_pool2d(self.mask(Fs, Ys), output_size=(1, 1)) we get a tensor of shape [B, C, 1, 1], right? In that case, how do I get a same-resolution representation of only the class-relevant information?

Thank you for the clarification. Yes, you are right: the average pooling output is (1, 1), and it produces an attention feature at the same resolution after multiplying with F^23, as in the following forward function.

def forward(self, *x):
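As a quick illustration of the broadcasting behaviour described above (a sketch, not the repository's forward implementation): multiplying a [B, C, 1, 1] vector with a [B, C, H, W] feature map yields an attention feature at the full spatial resolution.

```python
import torch

att = torch.randn(2, 256, 1, 1)    # pooled attention vector, shape [B, C, 1, 1]
F23 = torch.randn(2, 256, 32, 32)  # feature map F^23, shape [B, C, H, W]

# Elementwise multiplication broadcasts the [B, C, 1, 1] vector across all
# spatial positions, so the attention feature keeps F^23's resolution.
attention_feature = att * F23
print(attention_feature.shape)     # torch.Size([2, 256, 32, 32])
```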

You can try using the prototype vector and check the performance difference. Thank you.

OK, thanks for your reply.