Equationliu/Kangaroo
Implementation of Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting
Python
Issues
- 2
Kangaroo when bsz is greater than 1.
#6 opened by cool-xiang
- 0
Encountering NaN output at a specific batch ID every run, and no change observed upon adjusting the learning rate
#5 opened by Zerohclmax
- 2
a question
#4 opened by cool-xiang
- 1
In line 263 of train.py, predict = model(inputs_embeds=data["hidden_states_early"]
#3 opened by cool-xiang
- 5
Training procedure of Kangaroo.
#2 opened by tim-pan
- 2
why warmup when evaluating
#1 opened by EganGu