intel/neural-speed

An innovative library for efficient LLM inference via low-bit quantization

C++ · Apache-2.0

Readme
47 Issues
349 Stargazers
8 Watchers

Watchers

DDEle
eemailme
ftian1 (Intel)
ghchris2021
hshen14
jhcloos
JohnClaw
seware (Intel)
