model-compression
There are 246 repositories under the model-compression topic.
microsoft/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
dkozlov/awesome-knowledge-distillation
Awesome Knowledge Distillation
huawei-noah/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Tencent/PocketFlow
An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
FLHonker/Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).
VainF/Torch-Pruning
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
he-y/Awesome-Pruning
A curated list of neural network pruning resources.
666DZY666/micronet
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
haitongli/knowledge-distillation-pytorch
A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
htqin/awesome-model-quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and the project is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
AberHu/Knowledge-Distillation-Zoo
PyTorch implementation of various Knowledge Distillation (KD) methods.
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
microsoft/NeuronBlocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
ethanhe42/channel-pruning
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
MingSun-Tse/Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
guan-yuan/awesome-AutoML-and-Lightweight-Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
lhyfst/knowledge-distillation-papers
knowledge distillation papers
alibaba/TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
cnkuangshi/LightCTR
A lightweight and scalable framework that combines mainstream Click-Through-Rate prediction algorithms, built on a computational DAG, with the Parameter Server approach and Ring-AllReduce collective communication.
horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
SforAiDl/KD_Lib
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
he-y/filter-pruning-geometric-median
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
iamhankai/ghostnet.pytorch
[CVPR2020] GhostNet: More Features from Cheap Operations
microsoft/archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, tools, and learning material.
mit-han-lab/amc
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Zhen-Dong/HAWQ
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
he-y/soft-filter-pruning
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
pratyushasharma/laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
JetRunner/BERT-of-Theseus
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).