/TALIS

Simple and fast server for GPTQ-quantized LLaMA inference

Primary Language: Python
