HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Introduction

This repo contains the PyTorch implementation of the paper HAQ: Hardware-Aware Automated Quantization with Mixed Precision (CVPR 2019, oral).

@inproceedings{haq,
author = {Wang, Kuan and Liu, Zhijian and Lin, Yujun and Lin, Ji and Han, Song},
title = {HAQ: Hardware-Aware Automated Quantization With Mixed Precision},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2019}
}

Other papers related to automated model design:

AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV 2018)
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR 2019)

Dependencies

We evaluated this code with PyTorch 1.1 (CUDA 10) and torchvision 0.3.0. You can install PyTorch with conda:

# install pytorch
conda install -y pytorch torchvision cudatoolkit=10.0 -c pytorch

You can then install the remaining packages with the following command:

# install packages
bash run/setup.sh

The current code base is tested under the following environment:

Python 3.7.3
PyTorch 1.1
torchvision 0.3.0
numpy 1.14
matplotlib 3.0.1
scikit-learn 0.21.0
easydict 1.8
progress 1.4
tensorboardX 1.7
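
To check a local environment against the list above, a minimal sketch like the following may help. The distribution names in TESTED are assumptions (for example, PyTorch is usually installed under the name "torch"), and the helper names are hypothetical, not part of this repo. Reading installed versions relies on importlib.metadata, which is only in the standard library from Python 3.8 on.

```python
# Sketch: compare installed package versions against the tested ones.
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # older Python: no importlib.metadata in the stdlib
    PackageNotFoundError = None
    version = None

# Tested versions from the list above, keyed by the ASSUMED distribution names.
TESTED = {
    "torch": "1.1",
    "torchvision": "0.3.0",
    "numpy": "1.14",
    "matplotlib": "3.0.1",
    "scikit-learn": "0.21.0",
    "easydict": "1.8",
    "progress": "1.4",
    "tensorboardX": "1.7",
}


def matches(tested, installed):
    """True if `installed` starts with the components of `tested` (1.14 ~ 1.14.5)."""
    want = tested.split(".")
    return installed.split(".")[:len(want)] == want


def find_mismatches(tested, installed):
    """Return {name: (tested, installed_or_None)} for packages that differ."""
    bad = {}
    for name, want in tested.items():
        have = installed.get(name)
        if have is None or not matches(want, have):
            bad[name] = (want, have)
    return bad


def installed_versions(names):
    """Look up the installed version of each distribution, or None if absent."""
    if version is None:
        raise RuntimeError("importlib.metadata needs Python 3.8 or newer")
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, (want, have) in find_mismatches(TESTED, installed_versions(TESTED)).items():
        print(f"{name}: tested with {want}, found {have}")
```

A mismatch does not necessarily mean the code will fail; it only flags packages that differ from the tested setup.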

Dataset

If you already have the ImageNet dataset prepared for PyTorch, you can create a symbolic link to it inside the data folder:

# prepare dataset, change the path to your own
ln -s /path/to/imagenet/ data/

If you do not have ImageNet yet, download the dataset and move the validation images into labeled subfolders. You can use the following script to do this: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
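
If you prefer Python over the shell script, the same idea (one subfolder per class, each validation image moved into the folder of its label) can be sketched as below. The function name sort_val_images and the labels mapping are hypothetical; you would need to build the mapping yourself from the ground-truth file that comes with your copy of ImageNet.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, labels):
    """Move each validation image into a subfolder named after its class.

    `labels` maps an image file name to its class ID (hypothetical input).
    Files that are not found in `val_dir` are skipped, so the function can
    be re-run safely. Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for file_name, class_id in labels.items():
        src = val_dir / file_name
        if not src.is_file():
            continue  # already moved, or not part of this folder
        class_dir = val_dir / class_id
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / file_name))
        moved += 1
    return moved
```

After this step, the validation folder has the one-folder-per-class layout that torchvision's ImageFolder dataset expects.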

Reinforcement learning search

Run the following bash script to search for the quantization strategy for a specific model:

bash run/run_search.sh

Usage details

python rl_quantize.py --help

Finetune Policy

After the search finishes, you obtain a quantization strategy list. Replace the strategy list in finetune.py with it to finetune the model and evaluate its performance on the ImageNet dataset.
The default strategy is the one found by the search with preserve ratio = 0.1:

# preserve ratio 10%
strategy = [6, 6, 5, 5, 5, 5, 4, 5, 5, 4, 5, 5, 5, 5, 5, 5, 3, 5, 4, 3, 5, 4, 3, 4, 4, 4, 2, 5, 4, 3, 3, 5, 3, 2, 5, 3, 2, 4, 3, 2, 5, 3, 2, 5, 3, 4, 2, 5, 2, 3, 4, 2, 3, 4]
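
To get a feel for a strategy list before finetuning, you can summarize it with a short sketch like the one below. It assumes that each entry is the bit width assigned to one layer, and that the preserve ratio is the quantized weight size divided by the size at a 32-bit baseline; both are assumptions about the convention, so check rl_quantize.py for the exact definition. The helpers average_bits and size_ratio and the per-layer param_counts input are hypothetical and not part of this repo.

```python
from collections import Counter

# Default strategy from above (preserve ratio 10%).
strategy = [6, 6, 5, 5, 5, 5, 4, 5, 5, 4, 5, 5, 5, 5, 5, 5, 3, 5, 4, 3,
            5, 4, 3, 4, 4, 4, 2, 5, 4, 3, 3, 5, 3, 2, 5, 3, 2, 4, 3, 2,
            5, 3, 2, 5, 3, 4, 2, 5, 2, 3, 4, 2, 3, 4]


def average_bits(bits, param_counts):
    """Average bit width, weighted by the number of parameters per layer."""
    if len(bits) != len(param_counts):
        raise ValueError("need one parameter count per layer")
    total = sum(param_counts)
    return sum(b * n for b, n in zip(bits, param_counts)) / total


def size_ratio(bits, param_counts, full_bits=32):
    """Quantized weight size relative to a full-precision baseline (assumed 32-bit)."""
    return average_bits(bits, param_counts) / full_bits


# How many layers the strategy covers and how often each bit width is used.
print(len(strategy), "layers")
print(sorted(Counter(strategy).items()))
```

The weighted average matters because layers differ widely in size: a low bit width on a large layer saves far more than the same bit width on a small one. To use size_ratio on the real model, pass the per-layer weight counts in the same order as the strategy list.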

Run the following bash script to finetune the quantized model and improve its accuracy:

bash run/run_finetune.sh

Usage details

python finetune.py --help

Evaluate

You can download the pretrained quantized model and evaluate it.

# download checkpoint
mkdir -p checkpoints/resnet50/
cd checkpoints/resnet50/
wget https://hanlab.mit.edu/files/haq/resnet50_0.1_75.48.pth.tar
cd ../..
# evaluate 
bash run/run_eval.sh

Models	preserve ratio	Top1 Acc (%)	Top5 Acc (%)
resnet50 (original)	1.0	76.15	92.87
resnet50 (compress10x)	0.1	75.48	92.42
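
For reference, Top1 and Top5 accuracy count a prediction as correct when the true label is among the 1 or 5 highest-scoring classes. The following is a minimal, framework-free sketch of that metric with toy data; it illustrates the definition only and is not the evaluation code used by run/run_eval.sh.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    `scores` is a list of per-class score lists, `labels` the true class indices.
    """
    if len(scores) != len(labels):
        raise ValueError("need one label per sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example: 3 samples, 3 classes.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
print(topk_accuracy(toy_scores, toy_labels, k=1))
print(topk_accuracy(toy_scores, toy_labels, k=2))
```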
