Issues
- 3
Duplicated softmax on layer weights?
#5 opened by LorrinWWW - 3
RuntimeError: The size of tensor a (17) must match the size of tensor b (18) at non-singleton dimension 1
#4 opened by marma1ade - 1
General Distill Stage?
#9 opened by lsyysl9711 - 1
I can't reproduce the paper's result.
#10 opened by cdp-study - 3
The args.embedding_emd argument is not defined
#2 opened by laomagic