Equationliu/Kangaroo

Implementation of Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting

Python

Issues

Kangaroo when bsz is greater than 1.
#6 opened 2 months ago by cool-xiang
2
Encountering NaN output at a specific batch ID every run, and no change observed upon adjusting the learning rate
#5 opened 3 months ago by Zerohclmax
0
a question
#4 opened 4 months ago by cool-xiang
2
In line 263 of train.py, predict = model(inputs_embeds=data["hidden_states_early"]
#3 opened 4 months ago by cool-xiang
1
Training procedure of Kangaroo.
#2 opened 4 months ago by tim-pan
5
why warmup when evaluating
#1 opened 6 months ago by EganGu
2