dmlc/gluon-nlp

[Enhancement] Add sequence-level distillation to NMT training

sxjscience opened this issue · 0 comments

Description

Add sequence-level distillation to NMT training. That is, we generate translations from the teacher model with beam search and train the student model on the generated outputs.
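To make the proposal concrete, here is a minimal, framework-free sketch of the data-generation step. Everything in it is an assumption for illustration: `beam_search`, `build_distilled_pairs`, the `next_log_probs` callback, and `toy_teacher` are hypothetical names, not part of the GluonNLP API, and the beam search uses plain summed log-probabilities without length normalization.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

EOS = "</s>"

# Hypothetical teacher interface (an assumption, not a GluonNLP API):
# maps (source tokens, target prefix) to log-probabilities of the next token.
NextLogProbs = Callable[[Sequence[str], Sequence[str]], Dict[str, float]]


def beam_search(next_log_probs: NextLogProbs, src: Sequence[str],
                beam_size: int = 4, max_len: int = 20) -> List[str]:
    """Return the teacher output with the highest summed log-probability."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        # Expand every live hypothesis by one token.
        candidates = [(prefix + [tok], score + lp)
                      for prefix, score in beams
                      for tok, lp in next_log_probs(src, prefix).items()]
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        # Keep the top `beam_size` expansions; move ended ones to `finished`.
        for prefix, score in candidates[:beam_size]:
            if prefix[-1] == EOS:
                finished.append((prefix[:-1], score))
            else:
                beams.append((prefix, score))
        if not beams:
            break
    # Fall back to unfinished hypotheses if nothing reached EOS.
    pool = finished or beams
    return max(pool, key=lambda c: c[1])[0]


def build_distilled_pairs(next_log_probs: NextLogProbs,
                          sources: Sequence[Sequence[str]],
                          beam_size: int = 4,
                          max_len: int = 20) -> List[Tuple[List[str], List[str]]]:
    """Pair each source sentence with the teacher's beam-search output."""
    return [(list(src), beam_search(next_log_probs, src, beam_size, max_len))
            for src in sources]


def toy_teacher(src: Sequence[str], prefix: Sequence[str]) -> Dict[str, float]:
    """Stand-in teacher: prefers the upper-cased source token at each step."""
    i = len(prefix)
    if i >= len(src):
        return {EOS: math.log(0.9), "<unk>": math.log(0.1)}
    return {src[i].upper(): math.log(0.8), "<unk>": math.log(0.2)}


distilled = build_distilled_pairs(toy_teacher, [["a", "b"], ["c"]])
```

The student would then be trained on `distilled` with the usual NMT training loop and loss; that part is unchanged and is not shown here.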

References