# QLLM

[ICLR 2024] The official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models".

Primary language: Python · License: Apache-2.0
