intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Python · Apache-2.0
Watchers
- 00mjk
- a2un (@IITH)
- AvenSunCA
- beoy
- bradjonesca (New York)
- cetium (Beijing, China)
- ChendaLi-Intel
- chensuyue (Intel)
- drkostas (University of Tennessee, Knoxville)
- eemailme
- facedetector
- ftian1 (Intel)
- ghchris2021
- hshen14
- jhcloos
- jnulzl (GuangZhou China)
- JohnnyOpcode (Toronto, Ontario, Canada)
- kgidney (seal software)
- LeonLv
- liuguoyou
- macsz (@IntelAI)
- mengniwang95
- michalwols (New York)
- mingxiaoh (Intel)
- NeoZhangJianyu (Intel)
- nunofernandes-plight (Photonics Precision Technologies, The Intelligence of Information & FasterCapital)
- ozf (Software Square, Byte Town, Logicstate, Computronia)
- PandurangaMallireddy
- prepstarr (Videotron Enigmatik Europa)
- qingswu (Canada)
- rajnishc8
- SMBgurus (ACME Internet Services LLC dba Born Consultants)
- WilliamTambellini (RWS)
- zhiqwang (axera)