addvin's Stars
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
- neuralmagic/deepsparse: Sparsity-aware deep learning inference runtime for CPUs
- neuralmagic/sparseml: Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
- vllm-project/llm-compressor: Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
- neuralmagic/sparsezoo: Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
- neuralmagic/sparsify: ML model optimization product to accelerate inference
- neuralmagic/nm-vllm: A high-throughput and memory-efficient inference and serving engine for LLMs