nf4
There is 1 repository under the nf4 topic.
intel/neural-speed
An innovative library for efficient LLM inference via low-bit quantization