elephantmipt/bert-distillation

Is it possible to distill an ensemble network based on RoBERTa?


I have an ensemble of two RoBERTa models. Does this framework support distilling such an ensemble?
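
For context, a common way to distill from an ensemble is to average the teachers' softened output distributions and train the student against that average. The sketch below is framework-free and is not taken from this repository: the function names, the temperature value, and the example logits are all hypothetical, and it assumes both teachers emit logits over the same set of classes.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature=2.0):
    """Average the softened distributions of several teachers (assumed scheme)."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / n_teachers for i in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(averaged teachers || student) on softened outputs, scaled by T^2."""
    target = ensemble_soft_targets(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)
    return kl * temperature ** 2


# Hypothetical logits for one example with three classes.
roberta_a = [2.0, 0.5, -1.0]
roberta_b = [1.5, 1.0, -0.5]
student = [1.0, 1.0, 0.0]

loss = distillation_loss(student, [roberta_a, roberta_b])
print(loss)
```

With this scheme the ensemble behaves like a single teacher from the student's point of view, so the open question is whether the framework accepts precomputed or combined teacher outputs in place of one teacher model.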