Question about channel_transpose in common_functions.py
Hyunseok-Kim0 opened this issue · 12 comments
Issue Type
Others
onnx2tf version number
1.1.25
Download URL for ONNX
Gist for reproduction: https://colab.research.google.com/gist/Hyunseok-Kim0/d0aaf6e9ac6fbe461c5f2364db4bc0b2/onnx2tf_20221117.ipynb
Parameter Replacement JSON
N/A
Description
- Purpose: Personal development
- What:
`channel_transpose` in common_functions.py is used in arithmetic operations like Add, Sub, Mul, etc. What is the main purpose of this function? When the second input has more dimensions, `channel_transpose` adds an extra squeeze layer and changes the output shape, but not vice versa.
Please see the gist (https://colab.research.google.com/gist/Hyunseok-Kim0/d0aaf6e9ac6fbe461c5f2364db4bc0b2/onnx2tf_20221117.ipynb). When the output of the network is `x = x + y`, ONNX and the converted TFLite model have the same output shape, but the converted TFLite model has the wrong output shape for `x = y + x`.
(Screenshot comparison: ONNX model | correct TFLite output for `x = x + y` | wrong TFLite output for `x = y + x`)
Thanks. Please engage in a little discussion with me.
I am aware of that problem. However, I am struggling to come up with a realistic way to implement a fix.
This problem is especially common with `Mul`, `Add`, `Sub`, and `Div`. For example, the following `Add` operation patterns can occur for 4D NHWC / 5D NDHWC input. My implementation is still rough, so I have kept it simple with the idea of broadcasting the Y side or compressing its dimensions.
All of the following Y patterns on the ONNX side need to be converted to NHWC for processing. Also, the examples below are very simple patterns that only need to be broadcast, with every dimension equal to 1 except one (see the rough sketch after the list).
e.g.
- pattern.1: `X = [1,128,128,3]` (TF input format), `Y = [3]` (3-channel constant of ONNX)
- pattern.2: `X = [1,128,128,3]` (TF input format), `Y = [1,3,1,1]` (NCHW constant of ONNX)
- pattern.3: `X = [1,128,128,3]` (TF input format), `Y = [3,1,1]` (CHW constant of ONNX)
- pattern.4: `X = [1,128,128,3]` (TF input format), `Y = [1,3]` (NC constant of ONNX)
- pattern.5: `X = [1,128,128,3]` (TF input format), `Y = [3,128]` (CH constant of ONNX)
- pattern.6: `X = [1,128,128,3]` (TF input format), `Y = [1,3,128]` (NCH constant of ONNX)
- pattern.7: `X = [1,128,128,3]` (TF input format), `Y = [128,128]` (HW constant of ONNX)
- pattern.8: `X = [1,64,128,128,3]` (TF input format), `Y = [128,128]` (HW constant of ONNX)
- pattern.9: `X = [1,64,128,128,3]` (TF input format), `Y = [3]` (3-channel constant of ONNX)
- pattern.10: `X = [1,64,128,128,3]` (TF input format), `Y = [1,128]` (CH constant of ONNX)
- pattern.11: `X = [1,64,128,128,3]` (TF input format), `Y = [1,1,128,1]` (DCHW constant of ONNX)
- pattern.12: `X = [1,64,128,128,3]` (TF input format), `Y = [64]` (D constant of ONNX)
- pattern.13: `X = [1,128,128,128,64]` (TF input format), `Y = [128]` (? constant of ONNX)
- etc...
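For illustration only, here is a minimal NumPy sketch of that kind of mechanical NCHW-to-NHWC reinterpretation of a constant. The helper name `nchw_const_to_nhwc` is hypothetical and not part of onnx2tf, and it only covers constants that line up with NCHW after left-padding with 1s (e.g. patterns 2 and 3), which is part of why the general case is hard:

```python
import numpy as np

def nchw_const_to_nhwc(y: np.ndarray, x_rank: int) -> np.ndarray:
    """Hypothetical helper: left-pad an NCHW-style ONNX constant with size-1
    dimensions and move its channel axis to the end so it broadcasts against
    an NHWC TensorFlow tensor."""
    # Left-pad with 1s until y has the same rank as x
    y = y.reshape((1,) * (x_rank - y.ndim) + y.shape)
    # NCHW/NCDHW channel axis is 1; move it to the last position for NHWC/NDHWC
    perm = [0] + list(range(2, x_rank)) + [1]
    return np.transpose(y, perm)

# pattern.2: X = [1,128,128,3] (NHWC), Y = [1,3,1,1] (NCHW constant of ONNX)
x = np.zeros((1, 128, 128, 3), dtype=np.float32)
y = np.arange(3, dtype=np.float32).reshape(1, 3, 1, 1)
y_nhwc = nchw_const_to_nhwc(y, x.ndim)  # shape becomes (1, 1, 1, 3)
print((x + y_nhwc).shape)               # (1, 128, 128, 3)
```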
I have no idea how to successfully implement constant broadcasting for all dimensionalities up to xD, not just 2D to 5D. Right now I am forced to deal with only a limited set of patterns. Realistically, I believe we need to not only implement NumPy-style broadcasting, but also an implementation that mechanically reinterprets constants laid out in NCHW format as NHWC format and then broadcasts them.
It is very easy to implement a simple broadcast.
Now I understand how difficult this problem is to solve, since there is no information to guess the order of a tensor during conversion. However, is comparing the input and output shapes between ONNX and TensorFlow not enough? The ONNX model follows the NumPy broadcasting rule anyway, in which case the tensor shapes are compared starting from the trailing dimension. It looks like patterns such as 4, 5, and 12 cannot exist in ONNX.
Considering that only the channel dimension should change when going from ONNX to TensorFlow, I think the procedure stated below can work for broadcasting in arbitrary dimensions.
- If `y` has more dimensions, swap `x` and `y`.
- `unsqueeze(0)` `y` until `x` and `y` have the same number of dimensions.
- Compare the shape of `x` between ONNX and TensorFlow to figure out where the channel dimension moved.
- Transpose the channel dimension of `y` to match the order of `x` if needed.

As a result, one reshape layer and one transpose layer will be added. A rough sketch of this idea is shown below.
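A rough NumPy sketch of that procedure, assuming both the ONNX and TF shapes of `x` are known; the function `align_second_operand` is illustrative only, not the actual onnx2tf implementation:

```python
import numpy as np

def align_second_operand(x_tf: np.ndarray, y: np.ndarray, x_onnx_shape) -> np.ndarray:
    """Illustrative sketch of the proposed steps. Assumes x already has at least
    as many dimensions as y (swap them beforehand if not)."""
    # Step 2: unsqueeze(0) y until it has the same rank as x
    while y.ndim < x_tf.ndim:
        y = np.expand_dims(y, axis=0)
    # Step 3: the ONNX channel axis is 1 (NCHW); find where that size ended up
    # in the TF shape (ambiguous when several dimensions share the same size)
    tf_channel_axis = list(x_tf.shape).index(x_onnx_shape[1])
    # Step 4: move y's channel axis (1 in NCHW layout) to the TF channel position
    if tf_channel_axis != 1:
        y = np.moveaxis(y, 1, tf_channel_axis)
    return y

# Example: x is [1,3,128,128] in ONNX -> [1,128,128,3] in TF, y is [1,3,1,1]
x_tf = np.zeros((1, 128, 128, 3), dtype=np.float32)
y = np.ones((1, 3, 1, 1), dtype=np.float32)
print((x_tf + align_second_operand(x_tf, y, (1, 3, 128, 128))).shape)  # (1, 128, 128, 3)
```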
> Compare the shape of `x` between ONNX and TensorFlow to figure out where the channel dimension moved.
Unfortunately, there is a problem in ShuffleNet-based models that prevents locating the channel dimension when all dimensions except the batch size are the same. There are quite a few models where the very operation of comparing ONNX shapes to TensorFlow shapes breaks down.
e.g. ONNX x: `[1,80,80,80]`, `[40,40,40]`, etc...
It might be possible to handle this in a very limited way... 🤔
I may still be misunderstanding.
What about comparing intermediate outputs using a dummy input? For the cases you mentioned, brute force looks like the only solution.
> brute-force
Thanks.
I see. This idea had never occurred to me before. I will give it some thought.
Notes on implementation ideas (a rough sketch follows after this list):
- Get the result of inference on the corresponding OP alone with a dummy tensor.
- Flatten the output tensor to 1D and sort it in ascending order.
- Either use `numpy.ndarray.all` to determine an exact match, or loop through the numbers one by one to determine whether they are approximately equal (a simple equality comparison is likely to cause problems, since small arithmetic errors can occur). The logic for approximate comparison needs to be separated for integer and floating-point patterns.
  https://numpy.org/doc/stable/reference/generated/numpy.isclose.html
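A minimal sketch of that dummy-tensor comparison idea; `outputs_match` is a hypothetical helper, not the actual onnx2tf validation code:

```python
import numpy as np

def outputs_match(onnx_out: np.ndarray, tf_out: np.ndarray, rtol=1e-4, atol=1e-5) -> bool:
    """Flatten both single-OP results to 1D, sort them in ascending order, and
    compare: exact match for integers, approximate (np.isclose-style) for floats."""
    a = np.sort(onnx_out.ravel())
    b = np.sort(tf_out.ravel())
    if a.shape != b.shape:
        return False
    if np.issubdtype(a.dtype, np.integer) and np.issubdtype(b.dtype, np.integer):
        return bool(np.array_equal(a, b))
    return bool(np.allclose(a, b, rtol=rtol, atol=atol))

# Usage idea: run the OP alone with a dummy tensor for each candidate
# transpose/broadcast pattern and keep the pattern whose output matches ONNX.
```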
- `channel_transpose` branch, test.1: Hyunseok-Kim0 pattern, dummy1.onnx.zip
- `channel_transpose` branch, test.2: `PRelu`, face_recognition_sface_2021dec.onnx
- `channel_transpose` branch, test.3: dummy2.onnx.zip
- `channel_transpose` branch, test.4: dummy2.onnx.zip, `onnx2tf -i dummy2.onnx -k y`
- `channel_transpose` branch, test.5: dummy4.onnx.zip, `onnx2tf -i dummy4.onnx -kat y`
The current `explicit_broadcast` has a bug. It returns swapped operands if `const_or_var_2.shape` is all 1's. This causes a wrong calculation when the constant is the first operand. For example, (1 - x) is calculated as (x - 1) in the current version.
onnx2tf/onnx2tf/utils/common_functions.py, lines 641 to 651 in 86cc1a0
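To make the symptom concrete, a small NumPy illustration (not the onnx2tf code itself) of why returning swapped operands breaks non-commutative ops like Sub:

```python
import numpy as np

x = np.array([[0.5, 2.0]], dtype=np.float32)

# Sub is not commutative, so returning swapped operands effectively
# turns (1 - x) into (x - 1).
print(1.0 - x)  # [[ 0.5 -1. ]]
print(x - 1.0)  # [[-0.5  1. ]]
```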
Also, arithmetic operations between tensors of the same shape cannot be performed due to a wrong `transpose_perm`.
If `x` is (1, 384, 384, 3) and `y` has the same shape (1, 384, 384, 3), the current `transpose_perm` has the value (0, 2, 3, 1), so `y` is transposed to (1, 384, 3, 384).
I misunderstood, sorry.
For now, I have mostly fixed these bugs. Do you mind if I open a PR after checking some patterns to make sure the bugs are fixed?
I see. Thanks.
> The current `explicit_broadcast` has a bug. It returns swapped operands if `const_or_var_2.shape` is all 1's. This causes a wrong calculation when the constant is the first operand. For example, (1 - x) is calculated as (x - 1) in the current version.
Consider switching the order of decisions as follows.
    # If const_or_var_2.shape is all 1's, do not broadcast and return as is
    shape_for_judging_skip_processing = [
        i if i is not None else INF_INDEX_VALUE for i in const_or_var_2.shape
    ]
    if np.prod(shape_for_judging_skip_processing) == 1:
        return const_or_var_1, const_or_var_2

    # Swap: len(const_or_var_1.shape) > len(const_or_var_2.shape)
    if len(const_or_var_1.shape) < len(const_or_var_2.shape):
        const_or_var_1, const_or_var_2 = const_or_var_2, const_or_var_1
        graph_node_input_name1, graph_node_input_name2 = graph_node_input_name2, graph_node_input_name1
> Also, arithmetic operations between tensors of the same shape cannot be performed due to a wrong `transpose_perm`.
> If `x` is (1, 384, 384, 3) and `y` has the same shape (1, 384, 384, 3), the current `transpose_perm` has the value (0, 2, 3, 1), so `y` is transposed to (1, 384, 3, 384).
Consider adding logic that checks whether all dimensions match before calculating `transpose_perm`. However, I believe there are very few situations where tensors of the same shape are fed in as constants on the `NCHW` side of ONNX in the first place. A minimal version of such a check is sketched below.
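For illustration, a minimal sketch of that check; the helper name `shapes_fully_match` is hypothetical and not part of onnx2tf, and in `channel_transpose` it would be consulted before computing `transpose_perm`:

```python
def shapes_fully_match(shape_1, shape_2) -> bool:
    """Hypothetical helper: True when both shapes are fully defined and identical,
    in which case no channel transpose (and no transpose_perm) is needed at all."""
    if shape_1 is None or shape_2 is None:
        return False
    if any(dim is None for dim in list(shape_1) + list(shape_2)):
        return False
    return list(shape_1) == list(shape_2)

# e.g. x: (1, 384, 384, 3), y: (1, 384, 384, 3) -> skip the transpose entirely
print(shapes_fully_match((1, 384, 384, 3), (1, 384, 384, 3)))  # True
print(shapes_fully_match((1, 384, 384, 3), (1, 3, 384, 384)))  # False
```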
> For now, I have mostly fixed these bugs. Do you mind if I open a PR after checking some patterns to make sure the bugs are fixed?
Sorry. I was so focused on the text pointing out the bug that I missed this last sentence.
Of course. You are welcome. :)