Pinned Repositories
AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
llama.cpp
LLM inference in C/C++
Darshvino's Repositories
Darshvino/QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
Darshvino/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web