romsto/Speculative-Decoding
Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023).
Python · MIT
Issues
- 4
Generation using cache gives weird sentences
#1 opened by romsto