HobbitLong/RepDistiller

[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods

Python
BSD-2-Clause

Issues

test
#42 opened a month ago by Xinxinatg
2
How do you choose the optimal hyper-parameters?
#20 opened 4 years ago by JinYang88
2
about using the resnet models for cifar10
#48 opened 2 years ago by EmnaGuermazi97
1
Ensemble Task Implementation
#50 opened 2 years ago by sdsawtelle
2
the result is different in resnet56
#56 opened 2 years ago by SuWideSun
1
Failed to download the teacher models
#47 opened 2 years ago by Prisoneryc
2
No dev set split
#59 opened a year ago by guzy0324
0
ERROR :run ./fetch_pretrained_teachers.sh
#57 opened a year ago by DPingWu
1
How to use my own datasets?
#58 opened a year ago by 1997Jessie
1
Is Ensemble distillation also included?
#54 opened 2 years ago by YanjingLiLi
0
Hyperparameter Settings for KD on Imagenet
#53 opened 2 years ago by Calmepro777
0
Why using log_softmax instead of softmax?
#52 opened 2 years ago by nguyenvulong
1
Question about normalization constant Z_v1 and Z_v2 in the ContrastMemory
#51 opened 2 years ago by YujieZheng99
0
crd used in image enhancement task like Denoise\SR\Deblur.
#49 opened 2 years ago by YangGangZhiQi
0
resnet structure seems to be a bit wrong
#46 opened 3 years ago by surprisedong
3
Problem of the order of the normalization in Similarity-Preserving loss.
#45 opened 3 years ago by seacj
0
Cross modal KD implementation release?
#39 opened 4 years ago by liu115
1
Question on memory consumption for CRD loss when the dataset is very large
#40 opened 3 years ago by TMaysGGS
3
AttributeError: 'CIFAR100InstanceSample' object has no attribute 'train_data'
#38 opened 4 years ago by Jiawen-huang
3
Training scheme for linear probe on STL10 and TinyImagenet
#44 opened 3 years ago by 4m4n5
0
Error while running the code
#43 opened 3 years ago by frestuc
0
Why "opt.nce_k" in dataset cifar100 is 16384? How can I get this ?
#41 opened 3 years ago by MuHeDing
2
Hyper-parameters for reproducing the results on ImageNet
#36 opened 4 years ago by kumamonatseu
9
what is the difference between the position of putting "with torch.no_grad()"
#37 opened 4 years ago by ChriswooTalent
1
how to train my model?
#35 opened 4 years ago by 972461099
0
About deep mutual learning setting
#34 opened 4 years ago by swlzq
0
About the CE loss
#33 opened 4 years ago by XiXiRuPan
0
ImageNet results
#32 opened 4 years ago by senya-ashukha
0
How to train teacher model
#31 opened 4 years ago by tiancity-NJU
0
teacher model is too big to run with batch_size 64
#30 opened 4 years ago by tiancity-NJU
0
Question about pretrained teacher model
#29 opened 4 years ago by MaorunZhang
0
hyperparameters for other methods
#25 opened 4 years ago by wukailu
1
the introduction of ContrastMemory
#27 opened 4 years ago by sanshanxiashi
1
Multiple GPU training
#28 opened 4 years ago by deropty
0
Reported results based on early stopping?
#16 opened 5 years ago by VladimirLi
3
KD method in both configurations seems to be doing better than all other methods except the one from your paper
#26 opened 4 years ago by ksachdeva
1
questions about ContrastMemory
#24 opened 4 years ago by jianxiangm
3
AttributeError: 'CIFAR100Instance' object has no attribute 'train_data'
#23 opened 4 years ago by Yejing-Lai
0
How can I use CRD_loss in face landmark detection for model compression? There is no "opt.nce_k: number of negatives paired with each positive".
#22 opened 4 years ago by gjd2017
0
The calculation of correlation matrix
#21 opened 4 years ago by winycg
0
code for ensemble distillation
#17 opened 4 years ago by tonmoy-saikia
1
Can CRD be applied to cross-domain distillation?
#19 opened 4 years ago by Doraemonzm
2
The sampler is not consistent with the original implementation of CCKD
#18 opened 4 years ago by winycg
1
the implementation of cckd is not consistent with the paper
#15 opened 5 years ago by xiaojieli0903
2
Form of the h function for infinite dataset
#13 opened 5 years ago by brotherofken
2
Memory issue about the NST LOSS
#12 opened 5 years ago by leoozy
2
Questions about ContrastMemory
#11 opened 5 years ago by HelloTobe
5
Results on ImageNet
#10 opened 5 years ago by xuguodong03
1
In the 2 result tables, WRN-40-2, as the teacher, after distilling the students, the students get higher performance（CRD+KD）, why?
#9 opened 5 years ago by splinter21
1
Regression task
#8 opened 5 years ago by xjcvip007
1