With this repository, you can try model quantization of MobileNetV2 trained on the CIFAR10 dataset. Currently, post training static quantization and quantization aware training are supported.
model | quantization method | CIFAR10 val accuracy [%] | model size [MB] |
---|---|---|---|
MobileNetV2 (float) | - | 96.36 | 14 |
MobileNetV2 (int8) | post training static quantization | 95.53 | 3.8 |
MobileNetV2 (int8) | quantization aware training | 96.30 | 3.8 |
- Ubuntu OS
- CUDA (tested with 11.6)
- Python3 (tested with 3.8.8)
See requirements.txt for additional requirements.
Other versions may work, but note that torch>=1.3.0 is required to use the PyTorch quantization library.
$ pip install -r requirements.txt
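If you are unsure whether an existing environment satisfies the torch>=1.3.0 requirement, a small version check can help. The following is a minimal sketch using only the standard library; the helper names `version_tuple` and `supports_quantization` are hypothetical and not part of this repository.

```python
import re


def version_tuple(version):
    """Turn a string such as '1.13.1+cu116' into (1, 13, 1)."""
    # Drop any local build suffix after '+', then keep up to three numeric fields.
    nums = [int(n) for n in re.findall(r"\d+", version.split("+")[0])[:3]]
    # Pad short versions such as '1.3' with zeros so tuples compare correctly.
    return tuple(nums + [0] * (3 - len(nums)))


def supports_quantization(torch_version, minimum="1.3.0"):
    """Return True if torch_version is at least the stated minimum."""
    return version_tuple(torch_version) >= version_tuple(minimum)


print(supports_quantization("1.13.1+cu116"))
print(supports_quantization("1.2.0"))
```

In a real environment you would pass `torch.__version__` as the argument.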
Before training, sign up for W&B and create a new project named pytorch_model_quantization.
Get your API key from W&B > Settings > API keys and then:
$ echo 'WANDB_API_KEY = "xxxx"' > .env # replace xxxx with your own W&B API key
train.py will load the API key from .env to send training logs to W&B.
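To illustrate what loading the key from .env involves, here is a minimal sketch of a parser for the `KEY = "value"` format written by the echo command above. It is an assumption about the mechanism, not the code in train.py (which may well use a library for this); `read_env_file` is a hypothetical helper.

```python
import os
import tempfile


def read_env_file(path):
    """Parse KEY = "value" lines from a .env-style file into a dict."""
    values = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Tolerate spaces around "=" and optional surrounding quotes.
            values[key.strip()] = value.strip().strip("\"'")
    return values


# Demo on a throwaway file so a real .env is left untouched.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as f:
        f.write('WANDB_API_KEY = "xxxx"\n')
    env = read_env_file(env_path)

print(sorted(env))
```

The wandb client reads its key from the `WANDB_API_KEY` environment variable, so exporting the parsed value into `os.environ` before logging in is one way to use it.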
Pretrained weights are available:
unzip models_v2.zip
- models/exp_2000/model_best.pth: float model
- models/exp_2001/model_best.pth: model trained with quantization aware training
You need to train the float model first (this can be skipped if you use the pretrained weights):
$ EXP_ID=2000
$ python train.py $EXP_ID --mode normal --lr 0.005 --batch_size 64
The trained weights are saved to models/exp_2000/best_model.pth.
To evaluate this model:
$ python test.py $EXP_ID --mode normal
You can apply post training static quantization to this float model:
$ python test.py $EXP_ID --mode ptq --replace_relu --fuse_model
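For readers new to the technique, the core idea of post training static quantization can be sketched in plain Python: run calibration data through an observer to record a value range, derive a scale and zero point from that range, and map floats to 8-bit integers. This is a conceptual illustration with a simple min/max observer, not PyTorch's implementation or the code in test.py; all function names are hypothetical.

```python
def calibrate(batches):
    """Min/max observer: record the value range seen during calibration."""
    lo = min(min(batch) for batch in batches)
    hi = max(max(batch) for batch in batches)
    # Extend the range to include 0.0 so that zero is exactly representable.
    return min(lo, 0.0), max(hi, 0.0)


def qparams(lo, hi, qmin=0, qmax=255):
    """Derive scale and zero point for an affine 8-bit mapping."""
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range: avoid dividing by zero later
    zero_point = int(round(qmin - lo / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer code, clamping to the 8-bit range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate float."""
    return (q - zero_point) * scale


calibration_batches = [[-1.0, 0.25, 0.5], [0.0, 1.5, 3.0]]
lo, hi = calibrate(calibration_batches)
scale, zero_point = qparams(lo, hi)
codes = [quantize(x, scale, zero_point) for x in [-1.0, 0.0, 3.0]]
print(codes)  # → [0, 64, 255]
```

Each value is stored in one byte instead of four, which is why the quantized model is roughly a quarter of the float model's size.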
To compare the model size:
$ ls -lh models/exp_2000/scripted_*
...
-rw-r--r-- 1 kimura kimura 14M May 27 04:22 scripted_model_normal.pth # floating
-rw-r--r-- 1 kimura kimura 3.8M May 27 04:37 scripted_model_ptq-relu-fused.pth # quantized (post training static quantization)
...
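The same comparison can be made from Python if you want a number rather than a listing. The sketch below is an illustration only: `compression_ratio` is a hypothetical helper, and the demo uses small temporary files in place of the real scripted models.

```python
import os
import tempfile


def size_mib(path):
    """File size in MiB, the unit that `ls -lh` reports as M."""
    return os.path.getsize(path) / 1024 ** 2


def compression_ratio(float_path, quantized_path):
    """How many times smaller the quantized file is than the float file."""
    return os.path.getsize(float_path) / os.path.getsize(quantized_path)


# Demo with stand-in files; point the paths at the real .pth files instead.
with tempfile.TemporaryDirectory() as tmp:
    float_path = os.path.join(tmp, "float.bin")
    quantized_path = os.path.join(tmp, "int8.bin")
    with open(float_path, "wb") as f:
        f.write(b"\0" * 4096)
    with open(quantized_path, "wb") as f:
        f.write(b"\0" * 1024)
    ratio = compression_ratio(float_path, quantized_path)
    float_size = size_mib(float_path)

print(ratio)  # → 4.0
```

For the files listed above, 14M against 3.8M works out to a ratio of roughly 3.7.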
For quantization aware training (this can be skipped if you use the pretrained weights):
$ EXP_ID=2001
$ python train.py $EXP_ID --mode qat --replace_relu --fuse_model --lr 0.005 --batch_size 64
The trained weights are saved to models/exp_2001/best_model.pth.
To evaluate this model:
$ python test.py $EXP_ID --mode qat --replace_relu --fuse_model
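What distinguishes quantization aware training from the post training approach is that rounding is simulated during training, so the weights can adapt to it. The usual mechanism is "fake quantization": values are quantized and immediately dequantized in the forward pass, staying in floating point but carrying the int8 rounding and clamping error. A minimal conceptual sketch follows; `fake_quantize` is a hypothetical name, and the scale is chosen only to make the arithmetic exact.

```python
def fake_quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Quantize then dequantize, returning a float with int8 rounding error."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))  # clamp to the signed 8-bit range
    return (q - zero_point) * scale


weights = [0.5, -0.3, 2.0]
scale = 1 / 128  # an exact binary fraction, chosen for illustration
simulated = [fake_quantize(w, scale) for w in weights]
print(simulated)  # → [0.5, -0.296875, 0.9921875]
```

The first weight survives unchanged, the second picks up a rounding error, and the third is clamped because it falls outside the representable range. During training, gradients are typically passed through the rounding step as if it were the identity (the straight-through estimator).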
To check the model size:
$ ls -lh models/exp_2001/scripted_*
...
-rw-r--r-- 1 kimura kimura 3.8M May 27 07:52 scripted_model_ptq-relu-fused.pth # quantized (quantization aware training)
...
- Add a table to show model accuracy and performance
- Add more options for QAT (observer, etc.)
- Add models
- Finish docstring