Pinned Repositories
neural-speed
An innovative library for efficient LLM inference via low-bit quantization
xetla
openvino.genai
Run Generative AI models using native OpenVINO C++ API
parvizmp's Repositories
parvizmp/neural-speed
An innovative library for efficient LLM inference via low-bit quantization and sparsity
parvizmp/xetla