model-acceleration

There are 16 repositories under the model-acceleration topic.

  • he-y/Awesome-Pruning

    A curated list of neural network pruning resources.
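
    Magnitude pruning is the simplest of the techniques the resources above cover. As a hedged illustration (the function name, the toy weights, and the 50% sparsity setting are all hypothetical, not taken from any listed repo), unstructured magnitude pruning in NumPy:

    ```python
    import numpy as np

    def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
        """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
        k = int(weights.size * sparsity)  # number of weights to remove
        if k == 0:
            return weights.copy()
        # k-th smallest absolute value becomes the removal cutoff
        threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
        mask = np.abs(weights) > threshold  # keep only weights above the cutoff
        return weights * mask

    w = np.array([[0.5, -0.1, 0.03], [-0.7, 0.2, -0.05]])
    pruned = magnitude_prune(w, sparsity=0.5)  # half of the six weights zeroed
    ```

    Real pruning pipelines then fine-tune the surviving weights to recover accuracy; the sparse mask only translates into speedups when the hardware or kernel exploits it.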

  • htqin/awesome-model-quantization

    A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improving. PRs adding works (papers, repositories) missed by the repo are welcome.

  • guan-yuan/awesome-AutoML-and-Lightweight-Models

    A list of high-quality, recent AutoML works and lightweight models, including 1) Neural Architecture Search, 2) Lightweight Structures, 3) Model Compression, Quantization and Acceleration, 4) Hyperparameter Optimization, and 5) Automated Feature Engineering.

  • chester256/Model-Compression-Papers

    Papers for deep neural network compression and acceleration

  • musco-ai/musco-pytorch

    MUSCO: MUlti-Stage COmpression of neural networks

    Language: Jupyter Notebook

  • wangxb96/Awesome-AI-on-the-Edge

    Resources of our survey paper "Enabling AI on Edges: Techniques, Applications and Challenges"

  • sdc17/CrossGET

    [ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.

  • Lee-Gihun/MicroNet_OSI-AI

    (NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"

    Language: Python

  • signalogic/SigDL

    Deep Learning Compression and Acceleration SDK -- deep model compression for Edge and IoT embedded systems, and deep model acceleration for clouds and private servers

  • cantbebetter2/Awesome-Diffusion-Distillation

    A list of papers, docs, and code about diffusion distillation. This repo collects various distillation methods for diffusion models. PRs adding works (papers, repositories) missed by the repo are welcome.
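
    Diffusion-specific distillation methods build on the general idea of training a student to match a teacher's outputs. As a hedged sketch of that underlying idea, here is classic logit-matching knowledge distillation (Hinton-style KL loss with temperature softening) in NumPy — not any diffusion-specific method from the list, and all names are illustrative:

    ```python
    import numpy as np

    def softened(z: np.ndarray, T: float) -> np.ndarray:
        """Temperature-softened softmax over the last axis."""
        z = z / T
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, T: float = 2.0) -> float:
        """KL(teacher || student) on softened outputs, scaled by T^2."""
        p = softened(teacher_logits, T)
        q = softened(student_logits, T)
        return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

    teacher = np.array([[2.0, 0.5, -1.0]])
    student = np.array([[1.0, 1.0, -0.5]])
    loss = distillation_loss(student, teacher)  # positive; shrinks as student matches teacher
    ```

    Diffusion distillation methods replace the logits with denoiser outputs or trajectories (e.g. progressive or consistency distillation), but the teacher-matching objective has the same shape.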

  • MingSun-Tse/Caffe_IncReg

    [IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"

    Language: Makefile

  • TaehyeonKim-pyomu/CNN_compression_rank_selection_BayesOpt

    Bayesian Optimization-Based Global Optimal Rank Selection for Compression of Convolutional Neural Networks, IEEE Access

    Language: Python

  • bnabis93/vision-language-examples

    Vision-language model example code.

    Language: Python

  • likholat/openvino_quantization

    This sample shows how to convert a TensorFlow model to an OpenVINO IR model and how to quantize the OpenVINO model.

    Language: Python

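    OpenVINO performs quantization through its own tooling (such as NNCF), but the underlying arithmetic can be sketched in plain NumPy. A hedged illustration of symmetric per-tensor int8 quantization — function names and the toy tensor are hypothetical, not the OpenVINO API:

    ```python
    import numpy as np

    def quantize_int8(x: np.ndarray):
        """Symmetric per-tensor int8 quantization: x ~= scale * q."""
        max_abs = np.abs(x).max()
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        """Map int8 codes back to approximate float values."""
        return q.astype(np.float32) * scale

    x = np.array([0.1, -1.27, 0.635], dtype=np.float32)
    q, scale = quantize_int8(x)
    x_hat = dequantize(q, scale)  # within half a quantization step of x
    ```

    Production toolchains add calibration over representative data, per-channel scales, and zero-points for asymmetric ranges, but the round-scale-clip core is the same.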
  • dhingratul/Model-Compression

    Reduces model complexity by 612 times and memory footprint by 19.5 times compared to the base model, while staying above a worst-case accuracy threshold.

    Language: Jupyter Notebook

  • ksm26/Efficiently-Serving-LLMs

    Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

    Language: Jupyter Notebook
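
    KV caching, one of the optimization techniques the course above covers, avoids recomputing attention keys and values for already-generated tokens: each decode step appends one new key/value pair and attends over the stored history. A minimal single-head sketch in NumPy — the `KVCache` class is hypothetical, not the LoRAX implementation:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    class KVCache:
        """Append-only cache of past keys/values: each decode step costs
        O(seq_len) attention instead of recomputing the full O(seq_len^2)."""
        def __init__(self, d: int):
            self.k = np.empty((0, d))
            self.v = np.empty((0, d))

        def step(self, q, k_new, v_new):
            # Append this step's key/value, then attend over the full history.
            self.k = np.vstack([self.k, k_new])
            self.v = np.vstack([self.v, v_new])
            scores = q @ self.k.T / np.sqrt(q.shape[-1])  # (1, t)
            return softmax(scores) @ self.v               # (1, d)

    rng = np.random.default_rng(0)
    d = 4
    cache = KVCache(d)
    keys, vals = [], []
    for _ in range(3):  # decode three tokens with random projections
        q = rng.normal(size=(1, d))
        k = rng.normal(size=(1, d))
        v = rng.normal(size=(1, d))
        keys.append(k)
        vals.append(v)
        out = cache.step(q, k, v)
    ```

    The memory cost is what makes this a serving concern: the cache grows linearly with sequence length per request, which is why servers batch and page KV memory carefully.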