intel/auto-round
SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Python · Apache-2.0