/stagedspeculation

Staged speculative decoding for small-batch LLM inference

Primary language: Python. License: MIT.