Speculative Decoding slower (only in ExUI)
SinanAkkoyun opened this issue · 3 comments
SinanAkkoyun commented
Also, the UI is sooo nice, thank you for the cool work!
SinanAkkoyun commented
Hm, it could be that the draft model is not being loaded with a RoPE scale of 4.
Or is the scale irrelevant, and the alpha is all that counts and gets transferred?
SinanAkkoyun commented
Yes, that was it.
#21
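
To illustrate why the RoPE scale and the alpha are separate settings, here is a minimal sketch in plain Python. It does not use the ExLlamaV2 or ExUI API. The function name `rope_angles`, its parameters, and the default values are hypothetical, and the alpha formula is one common NTK-style formulation assumed for illustration. Linear scale divides the position index, while alpha enlarges the rotary base, so setting one does not reproduce the other. A draft model evaluated with scale 1 when scale 4 is expected sees different rotary angles, which would plausibly lower the draft acceptance rate and make speculative decoding slower.

```python
import math


def rope_angles(position, head_dim=128, base=10000.0, scale=1.0, alpha=1.0):
    """Return the rotary angles for one token position.

    scale: linear position scaling; the position index is divided by it.
    alpha: NTK-style scaling; the rotary base is enlarged instead
           (assumed formulation, for illustration only).
    """
    # Alpha changes the base, which mostly affects the low-frequency dimensions.
    adjusted_base = base * alpha ** (head_dim / (head_dim - 2))
    # Linear scale compresses the position, which affects every dimension equally.
    scaled_position = position / scale
    return [
        scaled_position / adjusted_base ** (2 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# Main model loaded with RoPE scale 4 (hypothetical setup).
main_angles = rope_angles(1000, scale=4.0)
# Draft model mistakenly loaded with the default scale of 1.
draft_mismatched = rope_angles(1000, scale=1.0)
# Draft model loaded with the same scale as the main model.
draft_matched = rope_angles(1000, scale=4.0)

# The mismatched draft sees angles 4x larger than the main model at every dimension.
ratio = draft_mismatched[0] / main_angles[0]
print(f"angle ratio (mismatched draft / main): {ratio}")
print(f"matched draft agrees with main: {draft_matched == main_angles}")
```

Under these assumptions, passing only alpha to the draft model is not enough: with `alpha=4.0` and `scale=1.0` the highest-frequency angle stays identical to the unscaled one, so the linear scale has to be applied to the draft model as well.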