zhewang1-intc/auto-round
SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Python · Apache-2.0