model-acceleration

There are 16 repositories under the model-acceleration topic.

  • he-y/Awesome-Pruning

    A curated list of neural network pruning resources.
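
    Magnitude pruning is the simplest of the techniques the resources above cover. As a hedged illustration (the function name, the toy weights, and the 50% sparsity setting are all hypothetical, not taken from any listed repo), unstructured magnitude pruning in NumPy:

    ```python
    import numpy as np

    def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
        """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
        k = int(weights.size * sparsity)  # number of weights to remove
        if k == 0:
            return weights.copy()
        # k-th smallest absolute value becomes the removal cutoff
        threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
        mask = np.abs(weights) > threshold  # keep only weights above the cutoff
        return weights * mask

    w = np.array([[0.5, -0.1, 0.03], [-0.7, 0.2, -0.05]])
    pruned = magnitude_prune(w, sparsity=0.5)  # half of the six weights zeroed
    ```

    Real pruning pipelines then fine-tune the surviving weights to recover accuracy; the sparse mask only translates into speedups when the hardware or kernel exploits it.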

  • htqin/awesome-model-quantization

    A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improving. PRs adding works (papers, repositories) missed by the repo are welcome.

  • guan-yuan/awesome-AutoML-and-Lightweight-Models

    A list of high-quality, recent AutoML works and lightweight models, including 1) Neural Architecture Search, 2) Lightweight Structures, 3) Model Compression, Quantization and Acceleration, 4) Hyperparameter Optimization, and 5) Automated Feature Engineering.

  • chester256/Model-Compression-Papers

    Papers for deep neural network compression and acceleration

  • musco-ai/musco-pytorch

    MUSCO: MUlti-Stage COmpression of neural networks

    Language: Jupyter Notebook

  • wangxb96/Awesome-AI-on-the-Edge

    Resources of our survey paper "Enabling AI on Edges: Techniques, Applications and Challenges"

  • sdc17/CrossGET

    [ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.

  • Lee-Gihun/MicroNet_OSI-AI

    (NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"

    Language: Python

  • signalogic/SigDL

    Deep Learning Compression and Acceleration SDK -- deep model compression for Edge and IoT embedded systems, and deep model acceleration for clouds and private servers

  • cantbebetter2/Awesome-Diffusion-Distillation

    A list of papers, docs, and code about diffusion distillation. This repo collects various distillation methods for diffusion models. PRs adding works (papers, repositories) missed by the repo are welcome.
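
    Diffusion-specific distillation methods build on the general idea of training a student to match a teacher's outputs. As a hedged sketch of that underlying idea, here is classic logit-matching knowledge distillation (Hinton-style KL loss with temperature softening) in NumPy — not any diffusion-specific method from the list, and all names are illustrative:

    ```python
    import numpy as np

    def softened(z: np.ndarray, T: float) -> np.ndarray:
        """Temperature-softened softmax over the last axis."""
        z = z / T
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, T: float = 2.0) -> float:
        """KL(teacher || student) on softened outputs, scaled by T^2."""
        p = softened(teacher_logits, T)
        q = softened(student_logits, T)
        return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

    teacher = np.array([[2.0, 0.5, -1.0]])
    student = np.array([[1.0, 1.0, -0.5]])
    loss = distillation_loss(student, teacher)  # positive; shrinks as student matches teacher
    ```

    Diffusion distillation methods replace the logits with denoiser outputs or trajectories (e.g. progressive or consistency distillation), but the teacher-matching objective has the same shape.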

  • MingSun-Tse/Caffe_IncReg

    [IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"

    Language: Makefile

  • TaehyeonKim-pyomu/CNN_compression_rank_selection_BayesOpt

    Bayesian Optimization-Based Global Optimal Rank Selection for Compression of Convolutional Neural Networks, IEEE Access

    Language: Python

  • bnabis93/vision-language-examples

    Vision-language model example code.

    Language: Python

  • likholat/openvino_quantization

    This sample shows how to convert a TensorFlow model to an OpenVINO IR model and how to quantize the OpenVINO model.

    Language: Python

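    OpenVINO performs quantization through its own tooling (such as NNCF), but the underlying arithmetic can be sketched in plain NumPy. A hedged illustration of symmetric per-tensor int8 quantization — function names and the toy tensor are hypothetical, not the OpenVINO API:

    ```python
    import numpy as np

    def quantize_int8(x: np.ndarray):
        """Symmetric per-tensor int8 quantization: x ~= scale * q."""
        max_abs = np.abs(x).max()
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        """Map int8 codes back to approximate float values."""
        return q.astype(np.float32) * scale

    x = np.array([0.1, -1.27, 0.635], dtype=np.float32)
    q, scale = quantize_int8(x)
    x_hat = dequantize(q, scale)  # within half a quantization step of x
    ```

    Production toolchains add calibration over representative data, per-channel scales, and zero-points for asymmetric ranges, but the round-scale-clip core is the same.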
  • dhingratul/Model-Compression

    Reduces model complexity by 612 times and memory footprint by 19.5 times compared to the base model, while staying above a worst-case accuracy threshold.

    Language: Jupyter Notebook

  • ksm26/Efficiently-Serving-LLMs

    Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

    Language: Jupyter Notebook
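
    KV caching, one of the optimization techniques the course above covers, avoids recomputing attention keys and values for already-generated tokens: each decode step appends one new key/value pair and attends over the stored history. A minimal single-head sketch in NumPy — the `KVCache` class is hypothetical, not the LoRAX implementation:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    class KVCache:
        """Append-only cache of past keys/values: each decode step costs
        O(seq_len) attention instead of recomputing the full O(seq_len^2)."""
        def __init__(self, d: int):
            self.k = np.empty((0, d))
            self.v = np.empty((0, d))

        def step(self, q, k_new, v_new):
            # Append this step's key/value, then attend over the full history.
            self.k = np.vstack([self.k, k_new])
            self.v = np.vstack([self.v, v_new])
            scores = q @ self.k.T / np.sqrt(q.shape[-1])  # (1, t)
            return softmax(scores) @ self.v               # (1, d)

    rng = np.random.default_rng(0)
    d = 4
    cache = KVCache(d)
    keys, vals = [], []
    for _ in range(3):  # decode three tokens with random projections
        q = rng.normal(size=(1, d))
        k = rng.normal(size=(1, d))
        v = rng.normal(size=(1, d))
        keys.append(k)
        vals.append(v)
        out = cache.step(q, k, v)
    ```

    The memory cost is what makes this a serving concern: the cache grows linearly with sequence length per request, which is why servers batch and page KV memory carefully.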