zhewang1-intc/auto-round
SOTA Weight-only Quantization Algorithm for LLMs. This is official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
PythonApache-2.0
No issues in this repository yet.
SOTA Weight-only Quantization Algorithm for LLMs. This is official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
PythonApache-2.0
No issues in this repository yet.