ArchipLab-LinfengZhang/Object-Detection-Knowledge-Distillation-ICLR2021

Question for the weighting parameter

nlkim0817 opened this issue · 0 comments

First of all, thank you for sharing your valuable code.
In the code below, what is the meaning of the multiplier 6 that follows each weighting parameter (7e-5 and 4e-3)? I could not find any details about it in the paper.

                kd_feat_loss += dist2(t_feats[_i], self.adaptation_layers[_i](x[_i]), attention_mask=sum_attention_mask,
                                      channel_attention_mask=c_sum_attention_mask) * 7e-5 * 6
                kd_channel_loss += torch.dist(torch.mean(t_feats[_i], [2, 3]),
                                              self.channel_wise_adaptation[_i](torch.mean(x[_i], [2, 3]))) * 4e-3 * 6
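
My current reading, which may be wrong, is that the 6 only rescales each coefficient, so the two loss terms end up with effective weights of 7e-5 * 6 and 4e-3 * 6. The short Python sketch below just works out that arithmetic. The constant and function names are hypothetical (they do not appear in the repository); only the numeric values come from the snippet above.

```python
import math

# Hypothetical names; only the numeric constants are taken from the snippet above.
FEAT_WEIGHT = 7e-5      # coefficient on the attention-guided feature loss
CHANNEL_WEIGHT = 4e-3   # coefficient on the channel-wise loss
EXTRA_FACTOR = 6        # the unexplained multiplier this issue asks about


def effective_weight(base_weight, factor=EXTRA_FACTOR):
    """Return the weight actually applied to a loss term (base * factor)."""
    return base_weight * factor


feat_eff = effective_weight(FEAT_WEIGHT)        # approximately 4.2e-4
channel_eff = effective_weight(CHANNEL_WEIGHT)  # approximately 2.4e-2

print(f"feature loss weight: {feat_eff:.2e}")
print(f"channel loss weight: {channel_eff:.2e}")
```

If that reading is right, the code is numerically the same as using about 4.2e-4 and 2.4e-2 directly, so my question is really why the factor 6 is written out separately.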