efficient-inference

There are 59 repositories under the efficient-inference topic.

huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Language: Python · 3.9k 51 259690
SqueezeAILab/LLMCompiler
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
Language: Python · 1.1k 18 580
snap-research/EfficientFormer
EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
Language: Python · 960 36 5890
huawei-noah/AdderNet
Code for paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
Language: Python · 948 25 72186
horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Language: Python · 643 17 3932
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language: Python · 588 17 2437
liuzhuang13/slimming
Learning Efficient Convolutional Networks through Network Slimming, in ICCV 2017.
Language: Lua · 554 21 1272
VITA-Group/LightGaussian
"LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
Language: Python · 483 33 2534
Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
332 9 137
The-Learning-And-Vision-Atelier-LAVA/SMSR
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
Language: Python · 235 7 2829
changlin31/DS-Net
(CVPR 2021, Oral) Dynamic Slimmable Network
Language: Python · 225 9 1919
SqueezeAILab/KVQuant
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Language: Python · 218 14 1017
liuziwei7/mobile-id
Deep Face Model Compression
Language: MATLAB · 195 17 8102
xindongzhang/ELAN
[ECCV 2022] Efficient Long-Range Attention Network for Image Super-resolution
Language: Python · 192 8 1817
cure-lab/DeciWatch
[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
Language: Python · 171 9 2116
lucidrains/speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
Language: Python · 164 8 114
Picovoice/picollm
On-device LLM Inference Powered by X-Bit Quantization
Language: Python · 100 6 02
RAIVNLab/STR
Soft Threshold Weight Reparameterization for Learnable Sparsity
Language: Python · 84 7 811
kssteven418/BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
Language: Python · 82 6 49
snap-research/graphless-neural-networks
[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)
Language: Python · 80 8 1020
FranxYao/Partially-Observed-TreeCRFs
Implementation of AAAI 21 paper: Nested Named Entity Recognition with Partially Observed TreeCRFs
Language: Python · 52 4 97
IBM/AdaMML
Official implementation of AdaMML. https://arxiv.org/abs/2105.05165.
Language: Python · 50 6 69
tchittesh/lzu
Code for Learning to Zoom and Unzoom (CVPR 2023)
Language: Python · 46 2 14
raymin0223/fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
Language: Python · 43 2 58
ivclab/agegenderLMTCNN
Jia-Hong Lee, Yi-Ming Chan, Ting-Yen Chen, and Chu-Song Chen, "Joint Estimation of Age and Gender from Unconstrained Face Images using Lightweight Multi-task CNN for Mobile Applications," IEEE International Conference on Multimedia Information Processing and Retrieval, MIPR 2018
Language: Python · 40 7 93
yikaiw/RS-Nets
[ECCV 2020] Code release for "Resolution Switchable Networks for Runtime Efficient Image Recognition"
Language: Python · 39 5 28
bharathsudharsan/TinyML-Benchmark-NNs-on-MCUs
Code for WF-IoT paper 'TinyML Benchmark: Executing Fully Connected Neural Networks on Commodity Microcontrollers'
Language: Python · 30 3 311
linksense/EfficientNet.PyTorch
Concise, Modular, Human-friendly PyTorch implementation of EfficientNet with Pre-trained Weights.
Language: Python · 30 4 15
Zhen-Dong/CoDeNet
[FPGA'21] CoDeNet is an efficient object detection model on PyTorch, with SOTA performance on VOC and COCO based on CenterNet and Co-Designed deformable convolution.
Language: Python · 26 3 36
bharathsudharsan/CNN_on_MCU
Code for paper 'Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware'
Language: Jupyter Notebook · 24 4 019
VITA-Group/triple-wins
[ICLR 2020] "Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference"
Language: Python · 24 13 17
ivclab/NeuralMerger
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage," International Joint Conference on Artificial Intelligence (IJCAI), 2018
Language: Python · 20 6 13
xternalz/SDPoint
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
Language: Python · 18 3 14
ivclab/Multistage_Pruning
Cheng-Hao Tu, Jia-Hong Lee, Yi-Ming Chan and Chu-Song Chen, "Pruning Depthwise Separable Convolutions for MobileNet Compression," International Joint Conference on Neural Networks, IJCNN 2020, July 2020.
Language: Python · 16 6 03
snap-research/linkless-link-prediction
[ICML 2023] Linkless Link Prediction via Relational Distillation
Language: Python · 15 6 16
IBM/AutoVP
[ICLR24] AutoVP: An Automated Visual Prompting Framework and Benchmark
Language: Python · 14 6 01
