ONEforALL-S003/TWO

This is an experimental report on q-extract-torch and q-implant.
Originally, q-extract-torch was intended to support q-implant by producing the q-param.json and *.npy files that q-implant consumes.
However, it has major limitations and may rely on inaccurate assumptions, so we are simply leaving a record of our approaches here.

We developed and tested q-extract-torch on torchvision mobilenet_v2 and resnet18/50, so it may only work on CNN-based models.
Also, as this is a draft stage, we only handle rank-4 tensors; other ranks would not work properly (the permutation step would fail).

Below are our approaches and what we learned and thought.

  • Tensor names change during the model conversion steps (PyTorch→ONNX→TensorFlow→TensorFlow Lite→Circle) because each step has its own naming conventions.
    So a name mapping between PyTorch↔Circle is required.

  • Most tensor values are preserved during model conversion.
    So if we find the corresponding tensor value, we can map between PyTorch and Circle.

  • Based on this, we tried to map PyTorch and Circle tensors by their values.

  • However, during some conversion steps (such as PyTorch→ONNX), some values are changed by optimizations.
    Mapping based on tensor values must therefore cope with the difference between the original and the optimized values, so we assigned/marked each tensor with a unique integer value.
    (Even though the value is set to an integer, the data type must stay float, because some operators support float only.)
    By tracking the marked value, we can map a PyTorch tensor to a Circle tensor; even if a small error is introduced into a tensor, we can recover the mark by rounding, because all tensor values were marked with integers. (See the sketch below.)
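
A minimal sketch of the marking idea, assuming an eager-mode PyTorch model (the names `mark_params` and `recover_mark` are ours, for illustration only):

```python
import torch

def mark_params(model: torch.nn.Module):
    """Overwrite every parameter tensor with a unique integer value.

    The value is an integer, but the dtype stays float, since some
    operators only support float.
    """
    mark_to_name = {}
    mark = 1
    for name, param in model.named_parameters():
        with torch.no_grad():
            param.fill_(float(mark))  # unique integer, stored as float
        mark_to_name[mark] = name
        mark += 1
    return mark_to_name

def recover_mark(converted_tensor):
    """Recover the integer mark from a converted (Circle-side) tensor.

    Rounding absorbs the small numeric error that conversion may introduce.
    """
    return int(round(float(converted_tensor.flatten()[0])))
```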

  • When exporting PyTorch to ONNX, values that did not exist in the original model can be added to the ONNX model. For example, even if a Conv2d operator in PyTorch has no bias, the corresponding ONNX Conv operator may have a bias filled with zeros. So multiple tensors filled with zeros can exist.

  • When an operator changes during model conversion, q-extract-torch has to handle some of these cases.
    For example, Batch Normalization is supported by PyTorch, ONNX, and TensorFlow, but not by TensorFlow Lite; TensorFlow resolves Batch Normalization by splitting it into Mul and Add.

    • In the mapping step, setting the PyTorch BatchNorm parameters as below lets us track/map PyTorch BatchNorm ↔ Circle Mul and Add:
      • running_mean : 0
      • running_var : 1
      • eps : 0
      • momentum : 0 (used as 1 - momentum, so it should be 0)
      • γ (the operator's weight) : corresponds to the Mul operator in tflite/circle
      • β (the operator's bias) : corresponds to the Add operator in tflite/circle
    • In the extraction step, manually calculate the Circle Mul/Add values from the PyTorch BatchNorm, i.e. what the TensorFlow→TFLite conversion does (see the sketch below):
      • mul : weight
      • add : bias - running_mean * weight
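
A minimal sketch of both steps, assuming an eager-mode `torch.nn.BatchNorm2d` (the function names are ours, for illustration):

```python
import torch

def pin_bn_for_mapping(bn: torch.nn.BatchNorm2d) -> None:
    """Set the BatchNorm statistics as listed above for the mapping step."""
    with torch.no_grad():
        bn.running_mean.zero_()    # running_mean : 0
        bn.running_var.fill_(1.0)  # running_var  : 1
    bn.eps = 0.0                   # eps : 0
    bn.momentum = 0.0              # momentum : 0 (used as 1 - momentum)

def fold_bn(bn: torch.nn.BatchNorm2d):
    """Reproduce what the TensorFlow->TFLite conversion does.

    In general: Mul = gamma / sqrt(var + eps),
                Add = beta - mean * gamma / sqrt(var + eps).
    With the pinned values above (var = 1, eps = 0) this reduces to
    mul = weight and add = bias - running_mean * weight.
    """
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    mul = scale
    add = bn.bias - bn.running_mean * scale
    return mul, add
```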
  • During model conversion, some operators (like Transpose) can be added. q-implant should handle the operators that may be added during model conversion, and forward-propagate the q-param through operators that do not change the quantization parameters (min/max range, zero point, scale). A sketch of this propagation follows.
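
A minimal sketch of the propagation we have in mind, independent of q-implant's actual code (the `QParam` type and `propagate_qparam` function are ours, for illustration):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QParam:
    min: float
    max: float
    scale: float
    zero_point: int

# Shape-only operators do not change tensor values, so they cannot change
# the min/max range, zero point, or scale.
SHAPE_ONLY_OPS = {"TRANSPOSE", "RESHAPE", "SQUEEZE"}

def propagate_qparam(opcode: str, input_qparam: QParam) -> Optional[QParam]:
    if opcode in SHAPE_ONLY_OPS:
        return input_qparam  # forward the input's q-param unchanged
    return None  # value-changing operators need their own q-param
```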

  • As far as we know, if an operator is not a subclass of torch.nn.Module, we cannot access it from the module, so we cannot get a q-param for such operators (torch functions like add, mul, mean, etc.; see the sketch below).
    q-implant should handle the operators that cannot be extracted from PyTorch.
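
For illustration, this is the limitation we mean; as far as we know, PyTorch's eager-mode workaround is to wrap such functional ops in `torch.nn.quantized.FloatFunctional`, which is a module and can therefore carry quantization parameters:

```python
import torch

class PlainAdd(torch.nn.Module):
    def forward(self, x, y):
        # A bare function call: it never appears in named_modules(),
        # so there is no object to read a q-param from.
        return torch.add(x, y)

class ModuleAdd(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # A module wrapper for add: observers can attach to it, and after
        # convert() it exposes a scale and zero_point.
        self.add = torch.nn.quantized.FloatFunctional()

    def forward(self, x, y):
        return self.add.add(x, y)
```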

  • We made some changes to q-implant in order to use the data extracted from torchvision mobilenet/resnet (the cases described above as "q-implant should handle"):

    • We assumed each operator's min/max from its dtype, scale, and zero point (because the q-param.json format has no min/max fields).
    • We assumed an operator's min/max from its inputs' quantization parameters, and calculated the zero point and scale from the dtype's range.
    • For example, Add has no quantization parameters because it is not a subclass of torch.nn.Module. With inputs x, y and output z, we assumed Add's quantization parameters as below (see the sketch after the formulas):

      z_min = min(x_min, y_min, x_min + y_min)
      z_max = max(x_max, y_max, x_max + y_max)
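
A minimal sketch of this calculation, assuming uint8 and the usual affine relation real = scale * (q - zero_point) (the function name is ours, for illustration):

```python
def assume_add_qparam(x_min, x_max, y_min, y_max, qmin=0, qmax=255):
    # The assumption from above.
    z_min = min(x_min, y_min, x_min + y_min)
    z_max = max(x_max, y_max, x_max + y_max)
    # Derive scale and zero point from the dtype's range (uint8 by default).
    scale = (z_max - z_min) / (qmax - qmin)
    zero_point = int(round(qmin - z_min / scale))
    return z_min, z_max, scale, zero_point
```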
  • We think handling this carefully could work (though it would still introduce errors). However, even if it works in some cases, it would not be a good solution, because we would have to handle each case one by one (handling each opcode differently).

  • Handling these in q-implant would not be a good option, so we have to find a way to solve them on the extraction side.
    As we see it, there are two ways (there could be other/better solutions):

      1. Access the quantization parameters of the operators that come from torch (not torch.nn.Module), and map them to the corresponding Circle names.
      2. Export both the non-quantized model (for mapping) and the quantized model (for extraction) to ONNX, and process them together.
  • However, as far as we know (and what we know could be wrong):

      1. We are not sure there is a way to access those operators through torch.nn.Module.
      2. Some quantized operators in PyTorch do not support ONNX export.
  • We do not know a better way for q-extract-torch for now; the above is what we have tried.

  • Additionally, PyTorch's QuantStub and DeQuantStub are not exported to ONNX, so the Circle model has no Quantize/Dequantize operators when we run q-implant.
    Maybe we have to add code like the below to q-implant:

```cpp
// For each graph input, insert a CircleQuantize node so that the graph has
// an explicit Quantize operator (PyTorch's QuantStub is lost on ONNX export).
for (auto node : loco::input_nodes(g))
{
  auto circle_node = loco::must_cast<luci::CircleNode *>(node);

  // Create a Quantize node mirroring the input's dtype and shape.
  auto quantize = node->graph()->nodes()->create<luci::CircleQuantize>();
  quantize->name(circle_node->name() + "_Quantize");
  quantize->dtype(circle_node->dtype());
  quantize->rank(circle_node->rank());
  for (uint32_t i = 0; i < circle_node->rank(); ++i)
    quantize->dim(i).set(circle_node->dim(i).value());

  quantize->shape_status(luci::ShapeStatus::VALID);

  // Move the quantization parameters from the input node to the new
  // Quantize node; the input itself stays in float.
  copy_quantparam(circle_node, quantize);
  circle_node->quantparam(nullptr);
  circle_node->dtype(loco::DataType::FLOAT32);

  // Rewire so that every consumer of the input reads from the Quantize node.
  loco::replace(circle_node).with(quantize);
  quantize->input(circle_node);
  luci::add_origin(quantize, luci::get_origin(circle_node));
}
```

LGTM.
Thank you for your efforts.