yuanli2333/Teacher-free-Knowledge-Distillation
Knowledge Distillation: CVPR 2020 Oral, "Revisiting Knowledge Distillation via Label Smoothing Regularization"
Python · MIT License

Issues
Does this method work on detection tasks?
#35 opened by fmaaf - 0
KD loss is zero
#33 opened by minato1000 - 0
Does this work for a dataset with only two classes?
#31 opened by wugh - 1
Question about the loss function of Tf-reg KD
#24 opened by HowieMa - 0
Torch Vision Version
#28 opened by Amik-TJ - 0
Working with larger image size
#27 opened by sri9s - 0
Data augmentation for Tiny-ImageNet
#23 opened by aryanasadianuoit - 1
Difference between L_REG and LSR
#22 opened by real-brilliant - 3
Can't download the pre-trained model
#4 opened by SunCherry - 0
Implementation doesn't have loss_soft_regularization and loss_fn_kd for the ImageNet dataset
#21 opened by sainatarajan - 3
Have you ever tried deeper networks?
#17 opened by JiyueWang - 2
What is the difference between Born Again Network and your self-training KD method?
#18 opened by JiyueWang - 5
How to search for the best temperature and alpha
#16 opened by TimeBear - 1
Tf self-training parameters in the paper?
#14 opened by Shiro-LK - 6
Questions about the two Tf-KD methods
#2 opened by pecanjk - 2
Pretrained model for student network
#15 opened by he-y - 2
It just feels like "alchemy" (炼丹)
#12 opened by ykk648 - 2
Where's the paper?
#10 opened by vraivon - 0
A question about MobileNetV2
#8 opened by lansss - 0
Questions about KD loss
#5 opened by Paper99 - 1
Question about KD Regularization in code
#3 opened by GengZ