Neural Magic
Neural Magic (acquired by Red Hat) empowers developers to optimize and deploy LLMs at scale. Our model compression and acceleration enable top performance with vLLM.
Boston
Pinned Repositories
AutoFP8
deepsparse
Sparsity-aware deep learning inference runtime for CPUs
docs
Top-level directory for documentation and general content
examples
Notebooks using the Neural Magic libraries 📓
nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
sparsify
ML model optimization product to accelerate inference.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
yolov5
YOLOv5 in PyTorch > ONNX > CoreML > TFLite
Neural Magic's Repositories
neuralmagic/sparsify
ML model optimization product to accelerate inference.
neuralmagic/docs
Top-level directory for documentation and general content
neuralmagic/helm-charts
Helm charts for deploying nm-vllm
neuralmagic/mlperf_inference_results_v2.1
neuralmagic/optimum-deepsparse
neuralmagic/pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
neuralmagic/yolact
A simple, fully convolutional model for real-time instance segmentation.
neuralmagic/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
neuralmagic/yolov3
YOLOv3 in PyTorch > ONNX > CoreML > TFLite
neuralmagic/band_of_the_hawk
Hackathon 2022
neuralmagic/CLIP_benchmark
CLIP-like model evaluation
neuralmagic/lm-evaluation-harness-archive
A framework for few-shot evaluation of autoregressive language models.
neuralmagic/sahi
Framework-agnostic sliced/tiled inference + interactive UI + error analysis plots
neuralmagic/aws-do-eks
neuralmagic/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
neuralmagic/deepsparse-digitalocean-image
Repo for building and packaging a 1-click app for DigitalOcean
neuralmagic/hackathon_2024
woop wooop
neuralmagic/langchain
⚡ Building applications with LLMs through composability ⚡
neuralmagic/mlperf_inference_results_v3.0
neuralmagic/nm-AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
neuralmagic/nm-docker
Neural Magic Docker
neuralmagic/tensorrt-demo
neuralmagic/to-be-removed-llm-foundry
LLM training code for MosaicML foundation models
neuralmagic/vllm-benchmarking
Benchmarking Repo for vLLM
neuralmagic/vllm-server-benchmark
Simple benchmarking utility for vLLM Server