[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Primary language: Python. License: Apache License 2.0 (Apache-2.0)