Question about the weighting parameter
nlkim0817 opened this issue · 0 comments
First of all, thank you for sharing your valuable code.
In the code below, I would like to know the meaning of the multiplier 6 that follows each weighting parameter (7e-5 and 4e-3). I could not find any details about it in the paper.
kd_feat_loss += dist2(t_feats[_i], self.adaptation_layers[_i](x[_i]),
                      attention_mask=sum_attention_mask,
                      channel_attention_mask=c_sum_attention_mask) * 7e-5 * 6
kd_channel_loss += torch.dist(torch.mean(t_feats[_i], [2, 3]),
                              self.channel_wise_adaptation[_i](torch.mean(x[_i], [2, 3]))) * 4e-3 * 6
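
To make the question concrete: as far as I can tell, the two constants in each line simply multiply into one effective weight on the loss term. The minimal sketch below only does that arithmetic. The names `FEAT_WEIGHT`, `CHANNEL_WEIGHT`, `MULTIPLIER`, and `effective_weight` are my own and do not appear in the repository; only the numeric values are taken from the snippet above.

```python
# Hypothetical names; only the numeric constants come from the snippet above.
FEAT_WEIGHT = 7e-5      # weight on the feature distillation term
CHANNEL_WEIGHT = 4e-3   # weight on the channel-wise distillation term
MULTIPLIER = 6          # the unexplained factor this issue asks about


def effective_weight(base, multiplier=MULTIPLIER):
    """Return the single scalar that actually scales a loss term."""
    return base * multiplier


feat_eff = effective_weight(FEAT_WEIGHT)
channel_eff = effective_weight(CHANNEL_WEIGHT)

# Prints the combined weights in scientific notation.
print(f"feature: {feat_eff:.1e}, channel: {channel_eff:.1e}")
```

So the applied weights are roughly 4.2e-4 and 2.4e-2. What I cannot tell is whether the 6 has a meaning of its own or is just part of the tuned weight.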