Experimental report about `q-extract-torch` and `q-implant`.

Originally, `q-extract-torch` was intended to support `q-implant` by producing the `q-param.json` and `*.npy` files that `q-implant` consumes. However, it has major limitations and may rely on inaccurate assumptions, so we are simply leaving a record of our approaches here.
We developed/tested `q-extract-torch` on torchvision mobilenet_v2 and resnet18/50, so it may only work on CNN-based models. Also, at this draft stage we only handle rank-4 tensors; other ranks would not work properly (the permutation would not work). Below are our approaches and what we learned/thought.
- Names are changed due to different naming conventions at each model conversion step (PyTorch → Onnx → Tensorflow → Tensorflow Lite → Circle), so a name mapping between PyTorch ↔ Circle is required.
- Most tensor values are kept through the model conversion steps, so if we find the corresponding tensor value, we can map between PyTorch and Circle.
- At this point, we tried to map `PyTorch` and `Circle` based on tensor values.
- However, during model conversion (such as PyTorch → Onnx), some values are changed by optimization. Mapping based on tensor values has to cope with the difference between the original tensor value and the optimized tensor value, so we assigned/marked a unique integer value to every tensor. (Even though the value is set to an integer, the data type should be kept as float, because some operators support float only.) By tracking the marked value we can map a PyTorch tensor to a Circle tensor; even if a small error occurs on a tensor, we can still track it by rounding, because all tensor values were marked with integers. A minimal sketch of this marking step follows this item.
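As an illustration, here is a minimal sketch of the marking step, assuming a torchvision-style model with float32 parameters; the helper names are ours, not part of `q-extract-torch`:

```python
import torch

def mark_tensors(model: torch.nn.Module) -> dict:
    """Fill every parameter tensor with its own unique integer value.
    The dtype stays float32, since some operators support float only."""
    mapping = {}
    with torch.no_grad():
        for idx, (name, param) in enumerate(model.named_parameters()):
            param.fill_(float(idx))
            mapping[idx] = name
    return mapping

def recover_mark(values: torch.Tensor) -> int:
    """After conversion, rounding removes the small numeric errors
    introduced by optimizations, recovering the integer mark (and thus
    the original PyTorch name via the mapping above)."""
    return int(round(float(values.flatten()[0])))
```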
- When exporting `PyTorch` to `Onnx`, values that do not exist in the original model are added to the onnx model. For example, even if a Conv2d operator in PyTorch has no bias, the corresponding onnx Conv2d operator has a bias filled with zeros. So multiple tensors filled with zeros may exist. A small inspection script is sketched after this item.
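For reference, an inspection script like the one below (our own helper, not part of the tools; the model path is illustrative) can list all-zero initializers in an exported onnx model:

```python
import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load("exported.onnx")  # illustrative path
for init in model.graph.initializer:
    arr = numpy_helper.to_array(init)
    if arr.size > 0 and np.all(arr == 0):
        # e.g. a zero-filled bias that did not exist in the PyTorch model
        print("all-zero tensor:", init.name, arr.shape)
```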
- When an operator changes during model conversion, some of these changes have to be handled by `q-extract-torch`. For example, Batch Normalization is supported in PyTorch, Onnx and Tensorflow, but not in Tensorflow Lite; Tensorflow resolves Batch Normalization by splitting it into `Mul` and `Add`.
  - On the mapping step, setting the PyTorch `BatchNorm` parameters as below lets us track/map PyTorch `BatchNorm` ↔ Circle `Mul` and `Add`:
    - running_mean : 0
    - running_var : 1
    - eps : 0
    - momentum : 0 (used as 1 - momentum, so it should be 0)
    - γ (the operator's weight) : corresponds to the `Mul` operator in tflite/circle
    - β (the operator's bias) : corresponds to the `Add` operator in tflite/circle
  - On extraction, manually calculate the Circle Mul/Add values from the PyTorch BatchNorm (what the Tensorflow → Tflite conversion does); a sketch of this calculation follows this item:
    - mul : weight
    - add : bias - running_mean * weight
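A minimal sketch of that extraction step (our own helper, not the tools' code): the general fold is mul = γ/√(var + eps), which reduces to the table above when running_var = 1 and eps = 0:

```python
import torch

def fold_batchnorm(bn: torch.nn.BatchNorm2d):
    """Compute the Mul/Add operands that tflite/circle uses in place of
    BatchNorm:  y = x * mul + add."""
    inv_std = 1.0 / torch.sqrt(bn.running_var + bn.eps)
    mul = bn.weight * inv_std               # maps to the Circle Mul operand
    add = bn.bias - bn.running_mean * mul   # maps to the Circle Add operand
    return mul, add
```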
- During model conversion, some operators may be added (like `Transpose`). `q-implant` should handle operators that can be added during model conversion, and forward-propagate the `q-param` (min/max range, zero point, scale) for operators that do not change quantization parameters. A sketch of this propagation follows this item.
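A minimal sketch of such forward propagation, under an assumed q-param layout (the opcode names and dict keys here are illustrative, not `q-implant`'s actual API):

```python
PASS_THROUGH = {"TRANSPOSE", "RESHAPE"}  # ops that keep quantization params

def propagate_qparam(ops, qparam):
    """ops: (opcode, input_name, output_name) triples in topological order.
    qparam: tensor name -> {"scale": ..., "zerop": ..., "dtype": ...}."""
    for opcode, inp, out in ops:
        if opcode in PASS_THROUGH and inp in qparam and out not in qparam:
            # These ops only rearrange elements, so the min/max range,
            # scale and zero point of the output equal those of the input.
            qparam[out] = dict(qparam[inp])
    return qparam
```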
- As far as we know, if an operator is not a subclass of `torch.nn.Module`, we cannot access the operator from the module, so we cannot get the `q-param` for such operators (operators on `torch` like `add`, `mul`, `mean`, etc.). `q-implant` should handle those operators that cannot be extracted from PyTorch. The sketch after this item illustrates the limitation.
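The limitation can be seen with a small example: `named_modules()` only enumerates `torch.nn.Module` instances, so a functional op used in `forward` never shows up, and there is no place to read quantization parameters for it:

```python
import torch

class Block(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, 1)

    def forward(self, x):
        # functional op: not a Module, so it carries no accessible q-param
        return torch.add(self.conv(x), x)

print([name for name, _ in Block().named_modules()])
# prints ['', 'conv'] -- the add op does not appear anywhere
```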
- We made some changes to `q-implant` to use the data extracted from the torchvision mobilenet/resnet models (covering what is described above as "`q-implant` should handle"):
  - Assumed min/max from an operator's dtype, scale and zero point (because the `q-param.json` format does not have min/max parameters).
  - Assumed an operator's min/max from its inputs' quantization parameters, and calculated the zero point and scale using the dtype's range.
    - For example, `Add` does not have quantization parameters because it is not a subclass of `torch.nn.Module`. Let the inputs be x, y and the output be z. We assumed `Add`'s quantization parameters as below (a sketch of the full calculation follows the formulas):
```
z_min = min(x_min, y_min, x_min + y_min)
z_max = max(x_max, y_max, x_max + y_max)
```
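Putting the assumption together with the dtype-range calculation, a minimal sketch (our assumption, shown here for a uint8 range; the function name is illustrative) looks like:

```python
def assume_add_qparam(x_min, x_max, y_min, y_max,
                      dtype_min=0, dtype_max=255):  # uint8 range assumed
    # assumed output range of Add, as described above
    z_min = min(x_min, y_min, x_min + y_min)
    z_max = max(x_max, y_max, x_max + y_max)
    # asymmetric quantization over the dtype's range
    scale = (z_max - z_min) / (dtype_max - dtype_min)
    zerop = int(round(dtype_min - z_min / scale))
    return scale, zerop
```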
- We think that handling all of this carefully would work (though it would still carry errors). However, even if it works in some parts, it would not be a good solution, because we would have to handle these cases one by one (handling each opcode differently).
- Handling these in `q-implant` would not be a good option, so we have to find a way to solve it on the extraction side. As we considered it, there are two ways (these are just what we thought; there could be other/better solutions):
  - Access and get the quantization parameters of the operators (of `torch`, not `torch.nn.Module`), and map them to the corresponding circle names.
  - Export both the non-quantized model (for mapping) and the quantized model (for extraction) to `onnx`, and process them.
- However, as far as we know (and what we know could be wrong):
  - We are not sure there is a way to access them from `torch.nn.Module`.
  - Some quantized operators in `PyTorch` do not support `onnx` export.
- We do not know a better way for `q-extract-torch` for now; the above is what we have tried.
- Additionally, `PyTorch`'s `QuantStub` and `DeQuantStub` are not exported to `onnx`, so the circle model has no quantize/dequantize operators when we run `q-implant`. Maybe we have to add code like the one below to `q-implant`:
```cpp
// For every input node of the graph, insert a CircleQuantize node that
// takes over the input's quantization parameters, then turn the input
// itself back into a float32 node.
for (auto node : loco::input_nodes(g))
{
  auto circle_node = loco::must_cast<luci::CircleNode *>(node);

  // Create a Quantize node with the same dtype/shape as the input node
  auto quantize = node->graph()->nodes()->create<luci::CircleQuantize>();
  quantize->name(circle_node->name() + "_Quantize");
  quantize->dtype(circle_node->dtype());
  quantize->rank(circle_node->rank());
  for (uint32_t i = 0; i < circle_node->rank(); ++i)
    quantize->dim(i).set(circle_node->dim(i).value());
  quantize->shape_status(luci::ShapeStatus::VALID);

  // Move the quantization parameters from the input node to the new
  // Quantize node, and make the input node float32 again
  copy_quantparam(circle_node, quantize);
  circle_node->quantparam(nullptr);
  circle_node->dtype(loco::DataType::FLOAT32);

  // Reroute users of the input node through the Quantize node
  loco::replace(circle_node).with(quantize);
  quantize->input(circle_node);
  luci::add_origin(quantize, luci::get_origin(circle_node));
}
```
LGTM.
Thank you for your efforts.