Neural Magic
Neural Magic (acquired by Red Hat) empowers developers to optimize & deploy LLMs at scale. Our model compression & acceleration enable top performance with vLLM.
Boston
Pinned Repositories
AutoFP8
deepsparse
Sparsity-aware deep learning inference runtime for CPUs
docs
Top-level directory for documentation and general content
examples
Notebooks using the Neural Magic libraries 📓
nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
sparsify
ML model optimization product to accelerate inference
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
yolov5
YOLOv5 in PyTorch > ONNX > CoreML > TFLite
Neural Magic's Repositories
neuralmagic/pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
neuralmagic/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools