Issues
- 0
Error in Documentation
#150 opened by msalexms - 0
Relational KD
#147 opened by gadhane - 1
Is there a suitable speech enhancement?
#144 opened by zuowanbushiwo - 8
No module named 'KD_Lib.KD'
#143 opened by tolusophy - 1
Can I skip training the teacher network?
#142 opened by Haalum - 0
Test BERT2LSTM with mock data
#137 opened by NeelayS - 2
Create 'main' branch and set it as default
#132 opened by NeelayS - 0
Use mock data for unit tests
#131 opened by NeelayS - 2
Consider potential name change to 'kdlib'
#133 opened by NeelayS - 0
distillation of gelectra model
#130 opened by OriAlpha - 2
Issue with CUDA
#129 opened by OriAlpha - 4
custom dataloader for NLP dataset
#128 opened by OriAlpha - 2
- 4
Paper: Subclass Distillation
#54 opened by NeelayS - 7
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4
#122 opened by GeneralJing - 1
NameError: name 'best_student_id' is not defined
#116 opened by PiaCuk - 1
- 3
Pip install "stable" doesn't work
#108 opened by BrunoGomesCoelho - 0
Benchmarking KD
#105 opened by avishreekh - 3
- 5
Distributed Training
#79 opened by Het-Shah - 0
[Paper] Learning Deep Representations with Probabilistic Knowledge Transfer
#46 opened by khizirsiddiqui - 4
- 3
Restructuring KD_Lib
#61 opened by NeelayS - 0
- 0
Benchmarking Pruning and Quantization
#106 opened by avishreekh - 14
import error
#104 opened by samo313 - 2
DML Loss function
#102 opened by aryanasadianuoit - 14
Update setup.py
#82 opened by NeelayS - 0
[Paper] Regularizing Class-wise Predictions via Self-knowledge Distillation
#93 opened by ashwinvaswani - 2
Refix and Benchmarking
#56 opened by Het-Shah - 1
Very minor bug in TAKD
#92 opened by ashwinvaswani - 1
- 0
Evaluators missing in models
#91 opened by ashwinvaswani - 0
Quantization logs
#87 opened by avishreekh - 1
Quantization Aware Training
#81 opened by Het-Shah - 0
- 2
Dynamic Quantization
#72 opened by avishreekh - 3
Restructure text-based models and utilities
#57 opened by avishreekh - 0
Paper: The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
#50 opened by avishreekh - 4
Paper: Deep Mutual Learning
#49 opened by NeelayS - 0
Paper: Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
#38 opened by avishreekh - 0
Paper: Improving Generalization Robustness with Noisy Collaboration in Knowledge Distillation
#47 opened by NeelayS - 0
Paper: Preparing Lessons: Improve Knowledge Distillation with Better Supervision
#44 opened by NeelayS - 0
Born Again Neural Networks
#42 opened by khizirsiddiqui - 0
- 2
Device parameter of Base class
#33 opened by avishreekh - 2
Documentation
#28 opened by khizirsiddiqui