About the code before softmax in CAM_Module
kaneyxx opened this issue · 2 comments
Hi, thanks for sharing this awesome project :)
I have a question from reading your source code. In CAM_Module there is one line of code before the softmax function that does not exist in PAM_Module. As I understand it, for each channel vector you take the maximum value (computed from the query-key dot product) and subtract every value from it. But doesn't that mean a larger number then corresponds to a less relevant channel?
Sorry, I cannot understand this. Could you kindly explain it for me? Thanks a lot!
Hi @kaneyxx
That line was added to prevent loss divergence during training. The CAM module is borrowed from DANet. Our understanding of that line is that it forces the module to pay more attention to the more dissimilar channels.
Best
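For readers following along, here is a minimal sketch of the effect discussed above, assuming the line in question subtracts each query-key value from its row maximum before softmax, as described in the question. It is written in plain Python rather than PyTorch so it runs without dependencies. The names `softmax`, `inverted_softmax`, and `energy_row` are hypothetical and do not come from the repository.

```python
import math

def softmax(row):
    # Standard softmax, shifted by the row maximum for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def inverted_softmax(row):
    # Assumed form of the transform from this thread:
    # replace each value v with (row max - v), then apply softmax.
    row_max = max(row)
    flipped = [row_max - v for v in row]
    return softmax(flipped)

# Hypothetical query-key similarities of one channel against three channels.
energy_row = [5.0, 2.0, 0.5]

plain = softmax(energy_row)              # largest weight on the most similar channel
inverted = inverted_softmax(energy_row)  # largest weight on the least similar channel

# Softmax ignores a constant shift, so (max - v) yields the same weights as -v.
negated = softmax([-v for v in energy_row])

print("plain:   ", plain)
print("inverted:", inverted)
print("negated: ", negated)
```

Under this assumption the ordering of the weights flips: the channel with the smallest similarity receives the largest attention weight, which matches the reading that the module attends more to dissimilar channels. Because softmax is unaffected by adding a constant to every input, the result coincides with a softmax over the negated values.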
That makes sense to me, and I will read the DANet paper next. Thank you!