Question about channel_transpose in common_functions.py
Hyunseok-Kim0 opened this issue · 12 comments
Issue Type
Others
onnx2tf version number
1.1.25
Download URL for ONNX
Gist for reproduction: https://colab.research.google.com/gist/Hyunseok-Kim0/d0aaf6e9ac6fbe461c5f2364db4bc0b2/onnx2tf_20221117.ipynb
Parameter Replacement JSON
N/A
Description
- Purpose: Personal development
- What:
`channel_transpose` in common_functions.py is used in arithmetic operations like Add, Sub, Mul, etc. What is the main purpose of this function? When the second input has more dimensions, `channel_transpose` adds an extra squeeze layer and changes the output shape, but not vice versa.
Please see the gist (https://colab.research.google.com/gist/Hyunseok-Kim0/d0aaf6e9ac6fbe461c5f2364db4bc0b2/onnx2tf_20221117.ipynb). When the output of the network is `x = x + y`, ONNX and the converted TFLite model have the same output shape, but the converted TFLite model has the wrong output shape for `x = y + x`.
(Screenshot comparison: ONNX model | correct TFLite output for `x = x + y` | wrong TFLite output for `x = y + x`)
Thanks. Please engage in a little discussion with me.
I am aware of that problem. However, I am struggling to come up with a realistic way to implement a fix.
This problem is especially common with `Mul`, `Add`, `Sub`, and `Div`. For example, the following `Add` operation patterns can occur for 4D NHWC / 5D NDHWC input. My implementation is still rough, so I have kept it simple with the idea of broadcasting the Y side or compressing its dimensions.
All of the following Y patterns on the ONNX side need to be converted to NHWC for processing. Also, the examples below are very simple patterns that only need to be broadcast, with every dimension equal to 1 except one (see the rough sketch after the list).
e.g.
- pattern.1: `X = [1,128,128,3]` (TF input format), `Y = [3]` (3-channel constant of ONNX)
- pattern.2: `X = [1,128,128,3]` (TF input format), `Y = [1,3,1,1]` (NCHW constant of ONNX)
- pattern.3: `X = [1,128,128,3]` (TF input format), `Y = [3,1,1]` (CHW constant of ONNX)
- pattern.4: `X = [1,128,128,3]` (TF input format), `Y = [1,3]` (NC constant of ONNX)
- pattern.5: `X = [1,128,128,3]` (TF input format), `Y = [3,128]` (CH constant of ONNX)
- pattern.6: `X = [1,128,128,3]` (TF input format), `Y = [1,3,128]` (NCH constant of ONNX)
- pattern.7: `X = [1,128,128,3]` (TF input format), `Y = [128,128]` (HW constant of ONNX)
- pattern.8: `X = [1,64,128,128,3]` (TF input format), `Y = [128,128]` (HW constant of ONNX)
- pattern.9: `X = [1,64,128,128,3]` (TF input format), `Y = [3]` (3-channel constant of ONNX)
- pattern.10: `X = [1,64,128,128,3]` (TF input format), `Y = [1,128]` (CH constant of ONNX)
- pattern.11: `X = [1,64,128,128,3]` (TF input format), `Y = [1,1,128,1]` (DCHW constant of ONNX)
- pattern.12: `X = [1,64,128,128,3]` (TF input format), `Y = [64]` (D constant of ONNX)
- pattern.13: `X = [1,128,128,128,64]` (TF input format), `Y = [128]` (? constant of ONNX)
- etc...
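For illustration only, here is a minimal NumPy sketch of that kind of mechanical NCHW-to-NHWC reinterpretation of a constant. The helper name `nchw_const_to_nhwc` is hypothetical and not part of onnx2tf, and it only covers constants that line up with NCHW after left-padding with 1s (e.g. patterns 2 and 3), which is part of why the general case is hard:

```python
import numpy as np

def nchw_const_to_nhwc(y: np.ndarray, x_rank: int) -> np.ndarray:
    """Hypothetical helper: left-pad an NCHW-style ONNX constant with size-1
    dimensions and move its channel axis to the end so it broadcasts against
    an NHWC TensorFlow tensor."""
    # Left-pad with 1s until y has the same rank as x
    y = y.reshape((1,) * (x_rank - y.ndim) + y.shape)
    # NCHW/NCDHW channel axis is 1; move it to the last position for NHWC/NDHWC
    perm = [0] + list(range(2, x_rank)) + [1]
    return np.transpose(y, perm)

# pattern.2: X = [1,128,128,3] (NHWC), Y = [1,3,1,1] (NCHW constant of ONNX)
x = np.zeros((1, 128, 128, 3), dtype=np.float32)
y = np.arange(3, dtype=np.float32).reshape(1, 3, 1, 1)
y_nhwc = nchw_const_to_nhwc(y, x.ndim)  # shape becomes (1, 1, 1, 3)
print((x + y_nhwc).shape)               # (1, 128, 128, 3)
```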
I have no idea how to successfully implement constant broadcasting for all dimensionalities up to xD, not just 2D to 5D. Right now I am forced to deal with only a limited set of patterns. Realistically, I believe we need to not only implement NumPy-style broadcasting, but also an implementation that mechanically reinterprets constants laid out in NCHW format as NHWC format and then broadcasts them.
It is very easy to implement a simple broadcast.
Now I understand how difficult this problem is to solve, since there is no information to guess the order of a tensor during conversion. However, is comparing the input and output shapes between ONNX and TensorFlow not enough? The ONNX model follows the NumPy broadcasting rule anyway, in which case the tensor shapes are compared starting from the trailing dimension. It looks like patterns such as 4, 5, and 12 cannot exist in ONNX.
Considering that only the channel dimension should change when going from ONNX to TensorFlow, I think the procedure stated below can work for broadcasting in arbitrary dimensions.
- If `y` has more dimensions, swap `x` and `y`.
- `unsqueeze(0)` `y` until `x` and `y` have the same number of dimensions.
- Compare the shape of `x` between ONNX and TensorFlow to figure out where the channel dimension moved.
- Transpose the channel dimension of `y` to match the order of `x` if needed.

As a result, one reshape layer and one transpose layer will be added. A rough sketch of this idea is shown below.
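A rough NumPy sketch of that procedure, assuming both the ONNX and TF shapes of `x` are known; the function `align_second_operand` is illustrative only, not the actual onnx2tf implementation:

```python
import numpy as np

def align_second_operand(x_tf: np.ndarray, y: np.ndarray, x_onnx_shape) -> np.ndarray:
    """Illustrative sketch of the proposed steps. Assumes x already has at least
    as many dimensions as y (swap them beforehand if not)."""
    # Step 2: unsqueeze(0) y until it has the same rank as x
    while y.ndim < x_tf.ndim:
        y = np.expand_dims(y, axis=0)
    # Step 3: the ONNX channel axis is 1 (NCHW); find where that size ended up
    # in the TF shape (ambiguous when several dimensions share the same size)
    tf_channel_axis = list(x_tf.shape).index(x_onnx_shape[1])
    # Step 4: move y's channel axis (1 in NCHW layout) to the TF channel position
    if tf_channel_axis != 1:
        y = np.moveaxis(y, 1, tf_channel_axis)
    return y

# Example: x is [1,3,128,128] in ONNX -> [1,128,128,3] in TF, y is [1,3,1,1]
x_tf = np.zeros((1, 128, 128, 3), dtype=np.float32)
y = np.ones((1, 3, 1, 1), dtype=np.float32)
print((x_tf + align_second_operand(x_tf, y, (1, 3, 128, 128))).shape)  # (1, 128, 128, 3)
```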
> Compare the shape of `x` between ONNX and TensorFlow to figure out where the channel dimension moved.
Unfortunately, there is a problem in ShuffleNet-based models that prevents locating the channel dimension when all dimensions except the batch size are the same. There are quite a few models where the very operation of comparing ONNX shapes to TensorFlow shapes breaks down.
e.g. ONNX x: `[1,80,80,80]`, `[40,40,40]`, etc...
It might be possible to handle this in a very limited way... 🤔
I may still be misunderstanding.
What about comparing intermediate outputs using a dummy input? For the cases you mentioned, brute force looks like the only solution.
> brute-force
Thanks.
I see. This idea had never occurred to me before. I will give it some thought.
Notes on implementation ideas (a rough sketch follows after this list):
- Get the result of inference on the corresponding OP alone with a dummy tensor.
- Flatten the output tensor to 1D and sort it in ascending order.
- Either use `numpy.ndarray.all` to determine an exact match, or loop through the numbers one by one to determine whether they are approximately equal (a simple equality comparison is likely to cause problems, since small arithmetic errors can occur). The logic for approximate comparison needs to be separated for integer and floating-point patterns.
  https://numpy.org/doc/stable/reference/generated/numpy.isclose.html
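A minimal sketch of that dummy-tensor comparison idea; `outputs_match` is a hypothetical helper, not the actual onnx2tf validation code:

```python
import numpy as np

def outputs_match(onnx_out: np.ndarray, tf_out: np.ndarray, rtol=1e-4, atol=1e-5) -> bool:
    """Flatten both single-OP results to 1D, sort them in ascending order, and
    compare: exact match for integers, approximate (np.isclose-style) for floats."""
    a = np.sort(onnx_out.ravel())
    b = np.sort(tf_out.ravel())
    if a.shape != b.shape:
        return False
    if np.issubdtype(a.dtype, np.integer) and np.issubdtype(b.dtype, np.integer):
        return bool(np.array_equal(a, b))
    return bool(np.allclose(a, b, rtol=rtol, atol=atol))

# Usage idea: run the OP alone with a dummy tensor for each candidate
# transpose/broadcast pattern and keep the pattern whose output matches ONNX.
```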
- `channel_transpose` branch, test.1: Hyunseok-Kim0 pattern, dummy1.onnx.zip
- `channel_transpose` branch, test.2: `PRelu`, face_recognition_sface_2021dec.onnx
- `channel_transpose` branch, test.3: dummy2.onnx.zip
- `channel_transpose` branch, test.4: dummy2.onnx.zip, `onnx2tf -i dummy2.onnx -k y`
- `channel_transpose` branch, test.5: dummy4.onnx.zip, `onnx2tf -i dummy4.onnx -kat y`
The current `explicit_broadcast` has a bug. It returns swapped operands if `const_or_var_2.shape` is all 1's. This causes a wrong calculation when the constant is the first operand. For example, (1 - x) is calculated as (x - 1) in the current version.
onnx2tf/onnx2tf/utils/common_functions.py, lines 641 to 651 in 86cc1a0
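To make the symptom concrete, a small NumPy illustration (not the onnx2tf code itself) of why returning swapped operands breaks non-commutative ops like Sub:

```python
import numpy as np

x = np.array([[0.5, 2.0]], dtype=np.float32)

# Sub is not commutative, so returning swapped operands effectively
# turns (1 - x) into (x - 1).
print(1.0 - x)  # [[ 0.5 -1. ]]
print(x - 1.0)  # [[-0.5  1. ]]
```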
Also, arithmetic operations between tensors of the same shape cannot be performed due to a wrong `transpose_perm`.
If `x` is (1, 384, 384, 3) and `y` has the same shape (1, 384, 384, 3), the current `transpose_perm` has the value (0, 2, 3, 1), so `y` is transposed to (1, 384, 3, 384).
I misunderstood, sorry.
For now, I have mostly fixed these bugs. Do you mind if I open a PR after checking some patterns to make sure the bugs are fixed?
I see. Thanks.
> The current `explicit_broadcast` has a bug. It returns swapped operands if `const_or_var_2.shape` is all 1's. This causes a wrong calculation when the constant is the first operand. For example, (1 - x) is calculated as (x - 1) in the current version.
Consider switching the order of decisions as follows.
    # If const_or_var_2.shape is all 1's, do not broadcast and return as is
    shape_for_judging_skip_processing = [
        i if i is not None else INF_INDEX_VALUE for i in const_or_var_2.shape
    ]
    if np.prod(shape_for_judging_skip_processing) == 1:
        return const_or_var_1, const_or_var_2

    # Swap: len(const_or_var_1.shape) > len(const_or_var_2.shape)
    if len(const_or_var_1.shape) < len(const_or_var_2.shape):
        const_or_var_1, const_or_var_2 = const_or_var_2, const_or_var_1
        graph_node_input_name1, graph_node_input_name2 = graph_node_input_name2, graph_node_input_name1
> Also, arithmetic operations between tensors of the same shape cannot be performed due to a wrong `transpose_perm`.
> If `x` is (1, 384, 384, 3) and `y` has the same shape (1, 384, 384, 3), the current `transpose_perm` has the value (0, 2, 3, 1), so `y` is transposed to (1, 384, 3, 384).
Consider adding logic that checks whether all dimensions match before calculating `transpose_perm`. However, I believe there are very few situations where tensors of the same shape are fed in as constants on the `NCHW` side of ONNX in the first place. A minimal version of such a check is sketched below.
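For illustration, a minimal sketch of that check; the helper name `shapes_fully_match` is hypothetical and not part of onnx2tf, and in `channel_transpose` it would be consulted before computing `transpose_perm`:

```python
def shapes_fully_match(shape_1, shape_2) -> bool:
    """Hypothetical helper: True when both shapes are fully defined and identical,
    in which case no channel transpose (and no transpose_perm) is needed at all."""
    if shape_1 is None or shape_2 is None:
        return False
    if any(dim is None for dim in list(shape_1) + list(shape_2)):
        return False
    return list(shape_1) == list(shape_2)

# e.g. x: (1, 384, 384, 3), y: (1, 384, 384, 3) -> skip the transpose entirely
print(shapes_fully_match((1, 384, 384, 3), (1, 384, 384, 3)))  # True
print(shapes_fully_match((1, 384, 384, 3), (1, 3, 384, 384)))  # False
```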
> For now, I have mostly fixed these bugs. Do you mind if I open a PR after checking some patterns to make sure the bugs are fixed?
Sorry. I was so focused on the text pointing out the bug that I missed this last sentence.
Of course. You are welcome. :)