/TALIS

Simple and fast server for GPTQ-quantized LLaMA inference

Primary Language: Python
