QQQ is a hardware-optimized W4A8 (4-bit weight, 8-bit activation) quantization solution for LLMs.
Primary Language: Python