speculative-decoding
There are 17 repositories under the speculative-decoding topic.
intel/intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
PygmalionAI/aphrodite-engine
Large-scale LLM inference engine
SafeAILab/EAGLE
Official Implementation of EAGLE-1 and EAGLE-2
Infini-AI-Lab/Sequoia
Scalable and robust tree-based speculative decoding algorithm
Infini-AI-Lab/TriForce
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
kssteven418/BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
hemingkx/SpecDec
Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
mscheong01/speculative_decoding.c
Minimal C implementation of speculative decoding based on llama2.c
romsto/Speculative-Decoding
Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023).
AutonomicPerfectionist/PipeInfer
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
pinqian77/Dynasurge
Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding
u-hyszk/japanese-speculative-decoding
Verification of the effect of speculative decoding in Japanese.
PopoDev/BiLD
Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder
kinshukdua/SpecDec
Some experiments aimed at increasing LLM throughput and efficiency via Speculative Decoding.
wtlow003/ngram-decoding
(Re)-implementation of "Prompt Lookup Decoding" by Apoorv Saxena, with extended ideas from LLMA Decoding.
wtlow003/speculative-sampling
Implementation of Speculative Sampling in "Accelerating Large Language Model Decoding with Speculative Sampling"
majid-daliri/DISD
Coupling without Communication and Drafter-Invariant Speculative Decoding