Infini-AI-Lab/Sequoia

Scalable and robust tree-based speculative decoding algorithm

Python

Issues

Work On CPU
#16 opened 6 months ago by ZepinLi
0
Reproducibility: tree_search generates too small a tree
#13 opened 6 months ago by KexinFeng
8
Estimate the number of generated tokens per step from the acceptance-rate-vector?
#14 opened 6 months ago by KexinFeng
1
Question on tree search algorithm
#15 opened 6 months ago by cyLi-Tiger
3
Is there any benchmark that compares Sequoia against vanilla speculative decoding?
#10 opened 6 months ago by KexinFeng
2
How to benchmark for speedup and acceptance rate?
#12 opened 7 months ago by singularity-s0
7
Support for vLLM?
#11 opened 7 months ago by KexinFeng
1
Thanks for your good work.
#9 opened 7 months ago by xwjim
0
data loading timing and disk use
#4 opened 8 months ago by poedator
0
Integration with Lit-GPT
#3 opened 8 months ago by tchaton
2
Tensor shape mismatch when computing apply_rotary_pos_emb
#2 opened 8 months ago by Tomorrowdawn
5
Error `p.attn_bias_ptr is not correctly aligned` when testing
#1 opened 8 months ago by poedator
1