/streaming-llm

Efficient Streaming Language Models with Attention Sinks

Primary language: Python
License: MIT
