A fast inference library for running LLMs locally on modern consumer-class GPUs
Primary language: Python · License: MIT