post-training-quantization

There are 40 repositories under the post-training-quantization topic.

  • intel/neural-compressor

    SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques for TensorFlow, PyTorch, and ONNX Runtime (a minimal PTQ sketch follows this entry)

    Language: Python
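
A minimal sketch of what static INT8 post-training quantization can look like with neural-compressor's 2.x Python API. The toy model and random calibration data are placeholders of my own, not taken from the repository:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

# Toy FP32 model and a tiny synthetic calibration set (placeholders).
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU(), torch.nn.Flatten(),
    torch.nn.Linear(8 * 30 * 30, 10),
).eval()
calib_set = TensorDataset(torch.randn(32, 3, 32, 32),
                          torch.zeros(32, dtype=torch.long))
calib_loader = DataLoader(calib_set, batch_size=8)

# "static" PTQ observes activations on the calibration data to pick scales.
conf = PostTrainingQuantConfig(approach="static")
q_model = quantization.fit(model=model, conf=conf,
                           calib_dataloader=calib_loader)
q_model.save("./int8_model")
```
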
  • 666DZY666/micronet

    micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), covering high-bit (>2b) methods (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) ternary/binary methods (TWN/BNN/XNOR-Net), plus post-training quantization (PTQ) at 8 bits (TensorRT); (2) pruning: normal, regular, and group-convolution channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, FP32/FP16/INT8 (PTQ calibration), op adaptation (upsample), and dynamic shapes. A sketch of the symmetric INT8 idea behind 8-bit PTQ follows this entry.

    Language: Python
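
A generic sketch of the symmetric per-tensor INT8 quantization that 8-bit PTQ builds on (not micronet's actual code; max calibration is only one way to pick the scale):

```python
import torch

def int8_quantize(x: torch.Tensor):
    """Symmetric per-tensor INT8: q = clamp(round(x / scale), -127, 127)."""
    scale = x.abs().max() / 127.0  # max calibration over the tensor
    q = torch.clamp(torch.round(x / scale), -127, 127).to(torch.int8)
    return q, scale

def int8_dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(64, 64)
q, s = int8_quantize(w)
print("max abs error:", (int8_dequantize(q, s) - w).abs().max().item())
```
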
  • alibaba/TinyNeuralNetwork

    TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

    Language: Python
  • SqueezeAILab/SqueezeLLM

    [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

    Language: Python
  • ModelTC/llmc

    [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

    Language: Python
  • Xiuyu-Li/q-diffusion

    [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

    Language: Python
  • megvii-research/Sparsebit

    A model compression and acceleration toolbox based on PyTorch.

    Language: Python
  • megvii-research/FQ-ViT

    [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

    Language: Python
  • sayakpaul/Adventures-in-TensorFlow-Lite

    This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks (a minimal sketch follows this entry).

    Language: Jupyter Notebook
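
A minimal sketch of full-integer PTQ with the TFLite converter, in the spirit of those notebooks. The toy Keras model and random representative dataset are stand-ins of my own:

```python
import numpy as np
import tensorflow as tf

# Toy Keras model standing in for a trained network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

def representative_dataset():
    # A few batches of representative inputs for calibration.
    for _ in range(100):
        yield [np.random.rand(1, 32, 32, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()  # weights and activations become INT8
open("model_int8.tflite", "wb").write(tflite_model)
```
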
  • Hsu1023/DuQuant

    [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.

    Language: Python
  • hkproj/quantization-notes

    Notes on quantization in neural networks (a worked example of affine quantization follows this entry)

    Language: Jupyter Notebook
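
For reference, the core operation notes like these typically build on is affine (asymmetric) quantization. A small self-contained example, generic rather than drawn from the notes themselves:

```python
import numpy as np

def affine_quantize(x: np.ndarray, n_bits: int = 8):
    """Asymmetric quantization: q = round(x / scale) + zero_point."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

x = np.random.randn(1000).astype(np.float32)
q, scale, zp = affine_quantize(x)
x_hat = (q.astype(np.float32) - zp) * scale  # dequantize
print("mean abs error:", np.abs(x - x_hat).mean())
```
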
  • ModelTC/TFMQ-DM

    [CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

    Language: Jupyter Notebook
  • Sanjana7395/static_quantization

    Post-training static quantization using the ResNet18 architecture (a generic sketch of the workflow follows this entry)

    Language: Jupyter Notebook
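
A minimal sketch of PyTorch eager-mode post-training static quantization for ResNet18, assuming torchvision's quantizable model; the random calibration batches stand in for a real data loader:

```python
import torch
from torchvision.models.quantization import resnet18

# Quantizable ResNet18 ships with QuantStub/DeQuantStub already inserted.
model = resnet18(weights=None, quantize=False).eval()
model.fuse_model()  # fuse Conv+BN+ReLU so they quantize as one op

# Attach observers, then run calibration data through the model.
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
torch.ao.quantization.prepare(model, inplace=True)
with torch.no_grad():
    for _ in range(10):  # placeholder for a real calibration loader
        model(torch.randn(8, 3, 224, 224))

# Swap float modules for quantized ones using the observed ranges.
torch.ao.quantization.convert(model, inplace=True)
```
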
  • ModelTC/QLLM

    [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

    Language: Python
  • zysxmu/FDDA

    PyTorch implementation of our paper accepted at ECCV 2022, "Fine-grained Data Distribution Alignment for Post-Training Quantization"

    Language: Python
  • iszry/DI2N-PTQ4DM

    Improved the performance of 8-bit PTQ4DM, especially on FID.

    Language: Python
  • shieldforever/NeuronQuant

    [ASP-DAC 2025] "NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks" Official Implementation

    Language: Python
  • GongCheng1919/bias-compensation

    [CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation

    Language: Python
  • motokimura/pytorch_quantization_fx

    An example of quantizing MobileNetV2 trained on the CIFAR-10 dataset with PyTorch FX graph mode quantization (a generic sketch follows this entry)

    Language: Python
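
A minimal sketch of FX graph mode PTQ in the same spirit; the random calibration tensors are placeholders, and the repo's exact flow may differ:

```python
import torch
from torchvision.models import mobilenet_v2
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = mobilenet_v2(weights=None).eval()
example_inputs = (torch.randn(1, 3, 32, 32),)  # CIFAR-10-sized input

# Trace the model and insert observers per the qconfig mapping.
qconfig_mapping = get_default_qconfig_mapping("fbgemm")
prepared = prepare_fx(model, qconfig_mapping, example_inputs)

# Calibrate on representative data (random tensors stand in here).
with torch.no_grad():
    for _ in range(10):
        prepared(torch.randn(8, 3, 32, 32))

quantized = convert_fx(prepared)  # lower to INT8 quantized modules
```
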
  • Rumeysakeskin/ASR-Quantization

    Post-training quantization of an NVIDIA NeMo ASR model

    Language: Jupyter Notebook
  • likholat/openvino_quantization

    This sample shows how to convert a TensorFlow model to an OpenVINO IR model and how to quantize the OpenVINO model (a minimal sketch follows this entry).

    Language: Python
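
A minimal sketch of that flow with OpenVINO's Python API plus NNCF for the PTQ step. The model path and random calibration items are placeholders, and the repo's exact tooling may differ:

```python
import numpy as np
import openvino as ov
import nncf

# Convert a TensorFlow model (e.g., a SavedModel directory) to OpenVINO IR.
ov_model = ov.convert_model("saved_model_dir")  # placeholder path

# NNCF post-training quantization: calibrate on representative inputs.
calib_items = [np.random.rand(1, 224, 224, 3).astype(np.float32)
               for _ in range(100)]
quantized_model = nncf.quantize(ov_model, nncf.Dataset(calib_items))

ov.save_model(quantized_model, "model_int8.xml")
```
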
  • yester31/Quantization_EX

    Quantization examples for PTQ & QAT

    Language: Python
  • satya15july/quantization

    Model quantization with PyTorch, TensorFlow & Larq

    Language: C++
  • smpanaro/norm-tweaking

    A method applied after post-training quantization (PTQ) to improve quantized LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784

    Language: Python
  • ssi-research/eptq

    Implementation of EPTQ, an enhanced post-training quantization algorithm for DNN compression

    Language: Python
  • yester31/TensorRT_ONNX

    Generating a TensorRT engine from an ONNX model (a minimal Python sketch follows this entry)

    Language: C++
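
A minimal sketch of building an engine from ONNX with the TensorRT Python API (TensorRT 8.x-style calls; file paths are placeholders):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

# Parse the ONNX graph into the TensorRT network definition.
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # INT8 would also need a calibrator

engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```
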
  • AndreiZoltan/ptq_resnet20

    Low-bit (2/4/8/16) post-training quantization for ResNet20

    Language: Python
  • Gaurav-Van/Fine-Tuning-LLMs

    An introductory guide to different techniques for fine-tuning LLMs

    Language: Jupyter Notebook
  • generalMG/Medical-Dataset-Deep-Learning-Quantization-Data-Analysis

    This repository accompanies a research work published in MDPI Sensors and provides details about the project.

    Language: Python
  • TanyaChutani/Quantization_Tensorflow

    Quantization for object detection in TensorFlow 2.x

    Language: Python
  • yashmaniya0/Quantization-of-Image-Classification-Models

    A comprehensive study on the quantization of various CNN models, employing techniques such as post-training quantization (PTQ) and quantization-aware training (QAT).

    Language: Jupyter Notebook
  • OmidGhadami95/EfficientNetV2_Quantization_CK

    EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling an imbalanced dataset, etc.

    Language: Jupyter Notebook
  • yester31/TensorRT_Examples

    Useful sample code for building TensorRT models from ONNX

    Language: Python
  • Inpyo-Hong/Model-Compression-Paper-List

    Model Compression Paper List (Focusing on Quantization, Particularly Zero-Shot Quantization)

  • rioter1/embeddedAI

    Setting up a Snapdragon Neural Processing Engine (SNPE) environment to convert protobuf (.pb) files to DLC (deep learning container) files and to quantize neural networks with SNPE. Mobile app development: https://github.com/anshumax/mobilenn

    Language: Python