int8_experiments

Int8 quantization in OpenVINO

Using OpenVINO to run a PyTorch model on an Intel CPU

First, train your model:

cd $int8_experiment/pytorch
python main.py -f config.yaml -t LENET
========================================================
Configuration:
========================================================
using pytorch: 1.0.1.post2
dataset: mnist
lr: 0.001
batchsize: 100
num_epochs: 100
model_type: lenet
init_type: glorot
quantization: normal
operation_mode: normal
experiment_name: mnist
trained_model: ./mnist.pkl

[0] Test Accuracy of the model on the 10000 test images: 96.63 , lr:0.001    , loss:0.206096425
[1] Test Accuracy of the model on the 10000 test images: 97.3  , lr:0.00099  , loss:0.014005098
[2] Test Accuracy of the model on the 10000 test images: 97.96 , lr:0.0009801, loss:0.097614221
...
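
For reference, the LENET target is a LeNet-style convolutional network: the per-layer profile printed by the inference run below shows three convolutions, two poolings, and three fully-connected layers. A minimal sketch of such a model, with glorot (xavier) initialization as in the config; the class name, channel widths, and activations here are illustrative, not the repo's actual pytorch/main.py:

import torch
import torch.nn as nn

class LeNet(nn.Module):
    # LeNet-style layout: 3 convolutions, 2 poolings, 3 FC layers,
    # matching the layer profile the inference run prints below.
    def __init__(self, num_classes=10):
        super(LeNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, 5, padding=2), nn.ReLU(),   # 28x28 -> 28x28
            nn.MaxPool2d(2),                            # -> 14x14
            nn.Conv2d(6, 16, 5), nn.ReLU(),             # -> 10x10
            nn.MaxPool2d(2),                            # -> 5x5
            nn.Conv2d(16, 120, 5), nn.ReLU(),           # -> 1x1
        )
        self.classifier = nn.Sequential(
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 84), nn.ReLU(),               # widths illustrative
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.view(x.size(0), -1))

def init_glorot(m):
    # 'init_type: glorot' in the config; xavier == glorot in PyTorch.
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)

model = LeNet().apply(init_glorot)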

Run the OpenVINO Model Optimizer and Inference Engine:

cd $int8_experiment/openvino_py
python openvino_mnist.py -f config.yaml -t LENET
========================================================
Configuration:
========================================================
using pytorch: 1.0.1.post2
dataset: mnist
lr: 0.001
batchsize: 100
num_epochs: 1
model_type: lenet
init_type: glorot
quantization: normal
operation_mode: normal
experiment_name: mnist
trained_model: ./mnist.pkl

[INFO   ]   =======================================================================
[INFO   ]   exporting ./mnist.pkl to ONNX
[INFO   ]   =======================================================================
mnist.onnx exported!
[INFO   ]   =======================================================================
[INFO   ]   Running OpenVino optimizer on mnist.onnx
[INFO   ]   =======================================================================
Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:  /home/mhossein/myRepos/int8_experiment/openvino_py/mnist.onnx
    - Path for generated IR:    /home/mhossein/myRepos/int8_experiment/openvino_py/.
    - IR output name:   mnist
    - Log level:    ERROR
    - Batch:    Not specified, inherited from the model
    - Input layers:     Not specified, inherited from the model
    - Output layers:    Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:  Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:  FP32
    - Enable fusing:    True
    - Enable grouped convolutions fusing:   True
    - Move mean values to preprocess section:   False
    - Reverse input channels:   False
ONNX specific parameters:
Model Optimizer version:    1.5.12.49d067a0

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /home/mhossein/myRepos/int8_experiment/openvino_py/./mnist.xml
[ SUCCESS ] BIN file: /home/mhossein/myRepos/int8_experiment/openvino_py/./mnist.bin
[ SUCCESS ] Total execution time: 0.42 seconds. 
[INFO   ]   =======================================================================
[INFO   ]   Running Openvino Inference on 10000 images
[INFO   ]   =======================================================================
name                                                                   layer_type      exec_type       status          real_time, us
28                                                                     Convolution     jit_avx2_FP32   EXECUTED        351       
30                                                                     Pooling         jit_avx_FP32    EXECUTED        19        
31                                                                     Convolution     ref_any_FP32    EXECUTED        5828      
33                                                                     Pooling         jit_avx_FP32    EXECUTED        742       
34                                                                     Convolution     jit_avx2_FP32   EXECUTED        167       
34_nChw8c_nchw_43                                                      Reorder         reorder_FP32    EXECUTED        21        
43                                                                     Reshape         unknown_FP32    NOT_RUN         0         
44                                                                     FullyConnected  jit_gemm_FP32   EXECUTED        207       
45                                                                     FullyConnected  jit_gemm_FP32   EXECUTED        18        
46                                                                     FullyConnected  jit_gemm_FP32   EXECUTED        8         
out_46                                                                 Output          unknown_FP32    NOT_RUN         0         
accuracy = 0.8979
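
openvino_mnist.py chains three steps: export the pickled PyTorch model to ONNX, invoke the Model Optimizer to generate the .xml/.bin IR, and run the IR through the Inference Engine (the per-layer table above is the engine's performance-counter report). A condensed sketch of those steps, assuming the pre-2020 IENetwork/IEPlugin Python API that matches the version strings in the logs; the mo.py path, the pickled-module assumption, and the mnist_test iterable are illustrative:

import subprocess
import numpy as np
import torch
from openvino.inference_engine import IENetwork, IEPlugin

# 1) Reload the trained model and export it to ONNX.
#    (Assumes the whole nn.Module was pickled to mnist.pkl.)
model = torch.load("./mnist.pkl")
model.eval()
torch.onnx.export(model, torch.randn(1, 1, 28, 28), "mnist.onnx")

# 2) Convert ONNX to IR (mnist.xml + mnist.bin) with the Model
#    Optimizer; adjust mo.py's path to your OpenVINO install.
mo = "/opt/intel/openvino/deployment_tools/model_optimizer/mo.py"
subprocess.check_call(["python", mo, "--input_model", "mnist.onnx",
                       "--output_dir", "."])

# 3) Load the IR and run inference on the CPU plugin.
net = IENetwork(model="mnist.xml", weights="mnist.bin")
plugin = IEPlugin(device="CPU")
exec_net = plugin.load(network=net)
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))

correct = 0
for image, label in mnist_test:   # placeholder: (1,1,28,28) arrays + ints
    res = exec_net.infer({input_blob: image})
    correct += int(np.argmax(res[out_blob]) == label)
print("accuracy =", correct / len(mnist_test))

Note the fixed batch of 1 in the ONNX export: the Model Optimizer inherits the batch size from the model, as the argument dump above states.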

Quantize the model to INT8 with the calibration tool:

~/inference_engine_samples_build/intel64/Release/calibration_tool -t C -d CPU -i ./mnist_dataset/mnist_data -m mnist.xml -threshold 10
[ INFO ] InferenceEngine: 
    API version ............ 1.4
    Build .................. 19154
[ INFO ] Parsing input parameters
[ INFO ] Loading plugin

    API version ............ 1.5
    Build .................. lnx_20181004
    Description ....... MKLDNNPlugin
[ INFO ] Loading network files
[ INFO ] Preparing input blobs
[ INFO ] Batch size is 1
[ INFO ] Collecting accuracy metric in FP32 mode to get a baseline, collecting activation statistics
Progress: [....................] 100.00% done
  FP32 Accuracy: 20.17% 
[ INFO ] Verification of network accuracy if all possible layers converted to INT8
Validate int8 accuracy, threshold for activation statistics = 100.00
Progress: [....................] 100.00% done
   Accuracy is 12.50%
Validate int8 accuracy, threshold for activation statistics = 99.50
Progress: [....................] 100.00% done
   Accuracy is 12.83%
Validate int8 accuracy, threshold for activation statistics = 99.00
Progress: [....................] 100.00% done
   Accuracy is 13.33%
Validate int8 accuracy, threshold for activation statistics = 98.50
Progress: [....................] 100.00% done
   Accuracy is 12.00%
Validate int8 accuracy, threshold for activation statistics = 98.00
Progress: [....................] 100.00% done
   Accuracy is 12.67%
Validate int8 accuracy, threshold for activation statistics = 97.50
Progress: [....................] 100.00% done
   Accuracy is 13.17%
Validate int8 accuracy, threshold for activation statistics = 97.00
Progress: [....................] 100.00% done
   Accuracy is 12.50%
Validate int8 accuracy, threshold for activation statistics = 96.50
Progress: [....................] 100.00% done
   Accuracy is 12.17%
Validate int8 accuracy, threshold for activation statistics = 96.00
Progress: [....................] 100.00% done
   Accuracy is 12.50%
Validate int8 accuracy, threshold for activation statistics = 95.50
Progress: [....................] 100.00% done
   Accuracy is 12.67%
[ INFO ] Achieved required accuracy drop satisfying threshold
FP32 accuracy: 20.17% vs current Int8 configuration accuracy: 13.33% with threshold for activation statistic: 99.00%
Layers profile for Int8 quantization
28: I8
31: I8
34: I8
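
The profile lists the layers chosen to run in INT8: 28, 31, and 34 are the three convolutions from the execution table above. The "threshold for activation statistics" that the tool sweeps (100.00 down to 95.50) is a percentile: instead of scaling to the absolute maximum activation, the calibrator clips at, say, the 99.0th percentile before deriving the int8 scale, sacrificing a few outliers for finer resolution on the bulk of the values. A toy NumPy illustration of that trade-off (not the calibration_tool's actual implementation):

import numpy as np

def int8_scale(activations, threshold_pct=99.0):
    # Clip range at the given percentile of |x| instead of the true max,
    # so rare outliers do not waste int8 resolution.
    clip = np.percentile(np.abs(activations), threshold_pct)
    return clip / 127.0

def fake_quantize(x, scale):
    # Symmetric int8 quantize-dequantize: what an I8 layer "sees".
    q = np.clip(np.round(x / scale), -128, 127)
    return q * scale

acts = np.random.randn(10000) * 0.1
acts[:5] = 8.0                          # a few large outliers
for pct in (100.0, 99.5, 99.0):
    s = int8_scale(acts, pct)
    err = np.mean((acts - fake_quantize(acts, s)) ** 2)
    print("threshold %.1f%% -> scale %.5f, MSE %.6f" % (pct, s, err))

Lower thresholds clip more aggressively; the tool keeps the configuration whose accuracy drop from the FP32 baseline stays within the -threshold 10 (percent) budget passed on the command line, which is why it settles on 99.00 here.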