initzhang/hydragen-attention

An implementation of the core attention algorithm in the paper "Hydragen: High-Throughput LLM Inference with Shared Prefixes".
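
For orientation, the following is a minimal NumPy sketch of the idea behind Hydragen attention. It is not the code in this repository. Attention over the shared prefix and over each sequence's own suffix is computed separately, and the two partial results are merged using their log-sum-exp values. The function names (`attention_with_lse`, `combine`, `shared_prefix_attention`), the single-head layout, and the lack of causal masking are assumptions made for illustration.

```python
import numpy as np

def attention_with_lse(q, k, v):
    """Softmax attention plus the log-sum-exp of the scores.

    q: (n_q, d), k: (n_kv, d), v: (n_kv, d_v).
    Returns out (n_q, d_v) and lse (n_q,).
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    m = scores.max(axis=-1, keepdims=True)      # subtract the row max for stability
    p = np.exp(scores - m)
    denom = p.sum(axis=-1, keepdims=True)
    out = (p / denom) @ v
    lse = (m + np.log(denom)).squeeze(-1)       # log of the softmax denominator
    return out, lse

def combine(out_a, lse_a, out_b, lse_b):
    """Merge two partial attention results computed over disjoint key sets."""
    m = np.maximum(lse_a, lse_b)
    w_a = np.exp(lse_a - m)
    w_b = np.exp(lse_b - m)
    return (out_a * w_a[..., None] + out_b * w_b[..., None]) / (w_a + w_b)[..., None]

def shared_prefix_attention(q, prefix_k, prefix_v, suffix_k, suffix_v):
    """q: (b, n_q, d); prefix_k/v: (n_p, d); suffix_k/v: (b, n_s, d)."""
    b, n_q, d = q.shape
    # Prefix: fold the batch into the query dimension so that a single
    # matrix-matrix product serves every sequence sharing the prefix.
    p_out, p_lse = attention_with_lse(q.reshape(b * n_q, d), prefix_k, prefix_v)
    p_out = p_out.reshape(b, n_q, -1)
    p_lse = p_lse.reshape(b, n_q)
    # Suffix: each sequence attends to its own keys and values.
    s_out = np.empty_like(p_out)
    s_lse = np.empty_like(p_lse)
    for i in range(b):
        s_out[i], s_lse[i] = attention_with_lse(q[i], suffix_k[i], suffix_v[i])
    return combine(p_out, p_lse, s_out, s_lse)
```

Under these assumptions the result should match ordinary attention over the concatenation of prefix and suffix keys and values for each sequence. The difference is that the prefix keys and values are read once for the whole batch instead of once per sequence.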

Python
