anthony-intel/neural-speed
An innovative library for efficient LLM inference via low-bit quantization
C++ · Apache-2.0