Problem with the architecture of the attention modules.
QJ-Chen opened this issue · 0 comments
QJ-Chen commented
In `model.attention`, `AttentionModule1.shortcut_short` is defined but never used. Instead, the shortcut is computed with the downsample weights:

```
shortcut_short = self.soft_resdown3(x_s)
```

Similarly, `AttentionModule3.shortcut_short` is unnecessary.
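A minimal sketch of the suspected issue, for illustration only: the layer names follow the report, but the internals are simplified stand-in convolutions, not the repo's actual residual blocks. The helper `unused_parameters` is a hypothetical diagnostic showing how to confirm that a submodule defined in `__init__` never participates in the forward pass.

```python
import torch
import torch.nn as nn

class AttentionModule1(nn.Module):
    """Simplified stand-in reproducing the reported pattern."""

    def __init__(self, channels=4):
        super().__init__()
        self.soft_resdown3 = nn.Conv2d(channels, channels, 1)
        # Defined here but never called in forward():
        self.shortcut_short = nn.Conv2d(channels, channels, 1)

    def forward(self, x_s):
        # Reported behavior: the shortcut reuses the downsample weights.
        shortcut_short = self.soft_resdown3(x_s)
        # Presumably intended instead:
        # shortcut_short = self.shortcut_short(x_s)
        return shortcut_short

def unused_parameters(module, x):
    """Return names of parameters that receive no gradient, i.e. layers
    that do not contribute to the output."""
    module(x).sum().backward()
    return [name for name, p in module.named_parameters() if p.grad is None]
```

Running `unused_parameters(AttentionModule1(), torch.randn(1, 4, 8, 8))` lists the `shortcut_short` weights, confirming that branch is dead.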