A question about ablation studies in Table 2.

Question

Closed this issue 2 years ago · 2 comments

Hi,

I have a question about the ablation studies in Table 2.

How can a single subnet, e.g. MSDN (A->V) w/ L_distill, be trained with the semantic distillation loss, given that this loss measures the difference between the outputs of the two subnets?

Looking forward to your reply.

Answer 1 · 2023-02-23T04:59:38.000Z

Hi, @bad-meets-joke

In Table 2, MSDN (A->V) w/ L_distill means that the full MSDN (both subnets) is trained with the L_distill loss, but we only take the features from the A->V branch for classification. Please refer to the related descriptions in the paper.
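To make the distinction concrete, here is a minimal sketch of the idea, not the repository's actual code. It assumes each branch produces a vector of class scores, and it uses a symmetric KL divergence between the two softmax distributions as a stand-in for L_distill (see the paper for the exact definition). All function names here are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distill_loss(scores_a2v, scores_v2a):
    """Stand-in for L_distill (assumption): symmetric KL between both branches."""
    p = softmax(scores_a2v)
    q = softmax(scores_v2a)
    return 0.5 * (kl(p, q) + kl(q, p))


def predict_a2v_only(scores_a2v):
    """Classification that ignores the V->A branch entirely."""
    return max(range(len(scores_a2v)), key=scores_a2v.__getitem__)


# Toy class scores for one sample from each branch (made-up numbers).
scores_a2v = [2.0, 0.5, -1.0]
scores_v2a = [1.0, 1.5, -0.5]

# Training: the loss needs the outputs of BOTH branches.
loss = distill_loss(scores_a2v, scores_v2a)

# Evaluation for the "MSDN (A->V) w/ L_distill" row: only A->V is used.
pred = predict_a2v_only(scores_a2v)
```

The point is that the two-branch loss shapes the A->V branch during training, while the ablation row reports what that branch achieves on its own at test time.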

Best!

Answer 2 · 2023-02-23T05:26:16.000Z

Thanks for your reply. Now I get it.