Pinned Repositories
AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
llama.cpp
LLM inference in C/C++
Darshvino's Repositories
Darshvino/QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
Darshvino/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web