irfanICMLL/TorchDistiller

Is it better to combine CWD loss with other losses than to use CWD loss alone?

Hi!

The README describes how to train a model with channel-wise distillation, GAN loss, and pixel-wise distillation together.

Is it better to combine CWD loss with these other losses than to use CWD loss alone?
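
To make the question concrete, here is a minimal sketch of the two setups I am comparing. It is plain Python rather than the repository's code. `cwd_loss`, `combined_loss`, and the weights `w_cwd`, `w_pi`, `w_adv` are hypothetical names with placeholder values. The CWD term follows my understanding of channel-wise distillation (a temperature softmax over the spatial positions of each channel, then a KL divergence from teacher to student), so please correct me if the actual implementation differs.

```python
import math


def log_softmax(channel, tau):
    """Log-softmax over the flattened spatial positions of one channel."""
    scaled = [v / tau for v in channel]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in scaled))
    return [v - log_z for v in scaled]


def cwd_loss(student, teacher, tau=4.0):
    """Sketch of a channel-wise distillation loss.

    `student` and `teacher` are lists of channels; each channel is a flat
    list of activations over spatial positions. The tau**2 scaling and the
    average over channels are my assumptions about the formulation.
    """
    total = 0.0
    for s_ch, t_ch in zip(student, teacher):
        log_p_s = log_softmax(s_ch, tau)
        log_p_t = log_softmax(t_ch, tau)
        # KL(teacher || student) over the spatial distribution of this channel
        total += sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_p_t, log_p_s))
    return tau ** 2 * total / len(student)


def combined_loss(task, cwd, pixelwise=0.0, adversarial=0.0,
                  w_cwd=1.0, w_pi=1.0, w_adv=1.0):
    """Weighted sum of loss terms; all weights are placeholders, not tuned values."""
    return task + w_cwd * cwd + w_pi * pixelwise + w_adv * adversarial
```

Option A is `combined_loss(task, cwd)` with the other two terms left at zero, and option B also passes the pixel-wise and adversarial terms. My question is whether option B gives a measurable gain over option A.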