Fast inference from large language models via speculative decoding
Primary Language: Python