model-compression

There are 310 repositories under the model-compression topic.

microsoft/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
Language: Python 14.3k 283 2.1k 1.8k
huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Language: Python 4.3k 53 285731
dkozlov/awesome-knowledge-distillation
Awesome Knowledge Distillation
3.7k 112 9510
huawei-noah/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language: Python 3.1k 56 201643
VainF/Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
Language: Python 3.1k 34 427363
Tencent/PocketFlow
An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
Language: Python 2.9k 145 275495
FLHonker/Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).
2.6k 61 12337
he-y/Awesome-Pruning
A curated list of neural network pruning resources.
2.5k 89 28331
666DZY666/micronet
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
Language: Python 2.3k 41 110478
Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
2.2k 65 13227
haitongli/knowledge-distillation-pytorch
A flexible PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments.
Language: Python 2k 20 25350
AberHu/Knowledge-Distillation-Zoo
PyTorch implementation of various Knowledge Distillation (KD) methods.
Language: Python 1.7k 23 17268
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Language: Python 1.6k 117 362331
microsoft/NeuronBlocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Language: Python 1.5k 62 27193
huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
Language: Jupyter Notebook 1.3k 23 144219
ethanhe42/channel-pruning
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
Language: Python 1.1k 47 125311
MingSun-Tse/Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
952 52 1133
horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Language: Python 927 14 5443
guan-yuan/Awesome-AutoML-and-Lightweight-Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
853 55 0160
alibaba/TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
Language: Python 846 20 152128
lhyfst/knowledge-distillation-papers
knowledge distillation papers
759 35 486
Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
717 16 259
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language: Python 703 18 2849
cnkuangshi/LightCTR
A lightweight, scalable framework that combines mainstream click-through-rate (CTR) prediction algorithms built on a computational DAG with the Parameter Server approach and Ring-AllReduce collective communication.
Language: C++ 673 59 10141
SforAiDl/KD_Lib
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Language: Python 644 14 6362
he-y/filter-pruning-geometric-median
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
Language: Python 610 7 77114
cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, quantization, tools, and learning material.
537 21 061
iamhankai/ghostnet.pytorch
[CVPR2020] GhostNet: More Features from Cheap Operations
Language: Python 536 14 38116
microsoft/archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
Language: Python 480 26 3390
Zhen-Dong/HAWQ
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
Language: Python 447 14 4282
mit-han-lab/amc
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Language: Python 445 16 25114
1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
426 37 073
chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
401 20 080
pratyushasharma/laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Language: Python 388 23 2232
he-y/soft-filter-pruning
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Language: Python 382 9 3773
SqueezeAILab/KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Language: Python 381 11 1837
