onnx2tf

A self-made tool to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). Its purpose is to solve the problem of massive Transpose insertion in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Primary language: Python. License: MIT.



Model Conversion Status

https://github.com/PINTO0309/onnx2tf/wiki/model_status

Key concept

  • onnx-tensorflow is a very useful tool, but the performance of the generated TensorFlow models is significantly degraded because a large number of Transpose OPs are inserted before and after each OP during the format conversion from NCHW to NHWC. Therefore, I made this tool myself as a derivative of onnx-tensorflow that does not insert these Transpose OPs.

  • Most of the internal processing of the tool is written from scratch, but some of the more complex OPs have been adapted from onnx-tensorflow. I am very grateful to the engineers at IBM / LeapMind / Microsoft for developing onnx-tensorflow.

  • I have incorporated all my knowledge of model optimization to other models such as TFLite, EdgeTPU, TensorFlow.js and Myriad based on my years of experience implementing openvino2tensorflow and tflite2tensorflow. It probably has the best model optimization performance and conversion efficiency of any tool I have created in the past, and the lowest rate of conversion errors.

  • Supported layers list: see Supported layers.

  • If you are having trouble with conversion errors, searching for resolved or open issues will almost always solve your problems. Issues are knowledge for engineers around the world.

  • Contributors to this repository should first read Contribution Guide.

  • All OPs are decomposed into primitive operations as much as possible. This is beneficial for lateral deployment of models to frameworks other than TFLite. Therefore, OPs belonging to tf.keras.layers are almost never used, and the tool consists only of tf.xxx. (except for a very few OPs)

  • As I do not want to add more dependent packages, I do not use tensorflow_addons (tfa), but replace it with the standard OP of tensorflow.

  • It handles not only conversions of 4-dimensional inputs, such as NCHW to NHWC, but also inputs with 3, 5, or even more dimensions, for example NCDHW to NDHWC. However, since 1-D, 2-D, 3-D and 6-D inputs may produce patterns that are difficult to convert mechanically, parameters can be given to modify the tool's behavior externally. See Parameter replacement.

  • If there are undefined dimensions in the input OP, the model structure is not fully optimized and conversion errors are very likely to occur.

  • Immediately following a Reshape OP with dimensional compression and dimensional decompression, there is a 95% probability that the model transformation operation will be disrupted and errors will occur. For example, patterns such as [1,200,200,5] -> [1,200,-1] or [10,20,30,40,50] -> [10,2,10,30,10,4,50] or Flatten. See #8 Not able to reshape input in replace.json, or #15 Conv layer shape wrong, or #18 Question about channel_transpose in common_functions.py, or #105 [MobileFormer]Converted model outputs values mismatch with original ones., or #133 When Onnx Matmul inputs have different dimension.

  • TensorFlow's Convolution does not have an equivalent operation to ONNX's Padding operation. Therefore, a Pad OP is inserted immediately before a Convolution with Padding of size greater than 1.

  • Support conversion to TensorFlow saved model and TFLite (Float32/Float16/INT8).

  • Files exceeding the Protocol Buffers file size limit of 2GB are not supported. Therefore, the external data format is not supported at this initial stage of the tool's development.

  • If there are ONNX OPs that are not supported by TensorFlow, use simple-onnx-processing-tools to replace them with harmless OPs in advance and then use this tool to convert them. In other words, you can convert any model with your efforts.

  • ONNX splitting, merging, generating OPs, rewriting OP attributes, BGR<->RGB conversion, converting to JSON and editing in the IDE, batch size changes for undefined dimensions, and various other processing can be done with the simple-onnx-processing-tools. Therefore, it is recommended that models with very complex structures be converted to TFLite after modifying the structure beforehand.

  • BatchNormalization supports only inference mode.

  • LayerNormalization supports only inference mode.

  • Supports only opset 11 or higher.

  • If you do not like the generated TFLite OP name, edit it using tflite2json2tflite.

  • The generated Keras models cannot be used for retraining. If you want to train, you must build your own model.

  • When converting to TensorFlow.js, CoreML, etc., please generate saved_model with the --output_signaturedefs option and feed the generated saved_model to the respective converter: tensorflowjs_converter, coremltools, edgetpu_compiler, etc. If this option is not enabled, saved_model records only the minimum necessary information and its size is minimized. When this option is enabled, saved_model records the maximum amount of information; its size grows, but the output is in a format that supports conversion to other frameworks. It can also be used for serving.

  • There are many ONNX OPs that EdgeTPU does not support. Therefore, if you need to generate an EdgeTPU model, please specify --replace_***_to_pseudo_*** when converting your model. onnx2tf will attempt to replace the OP with an EdgeTPU-compatible OP whenever possible.

  • The main factors that cause accuracy degradation after model conversion are as follows

  1. Differences in Padding specifications
  2. Differences in Python division behavior during model transformation (error due to even rounding)
  3. Division that does not take epsilon into account
  4. Deprecated TrueDivision
  5. Differences in support for Power
  6. Differences in interpolation specifications during resizing
  7. Differences in the arithmetic precision supported by each operation
  8. Calculation error from scaling up or down when a scale is specified for image resizing

The above differences often cannot be dealt with by simply converting the model in a straightforward manner. Therefore, you need to replace the model yourself in advance with an operation that is less prone to errors.
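
As an illustration of the first factor (Padding), the sketch below compares the explicit symmetric padding that an ONNX Conv commonly carries with the padding TensorFlow derives on its own for padding="SAME". It is plain Python written for this explanation, not code taken from onnx2tf: the helper names tf_same_padding and onnx_pads_to_nhwc_paddings are hypothetical, a dilation of 1 is assumed, and the ONNX `pads` order for a 2-D Conv is taken to be [top, left, bottom, right].

```python
import math
from typing import List, Tuple


def tf_same_padding(in_size: int, kernel: int, stride: int) -> Tuple[int, int]:
    """Padding (before, after) along one axis for TensorFlow padding='SAME' (dilation 1)."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2          # TensorFlow puts the extra pixel at the end
    return before, total - before


def onnx_pads_to_nhwc_paddings(pads: List[int]) -> List[List[int]]:
    """Turn 2-D ONNX Conv pads [top, left, bottom, right] into NHWC paddings.

    The result has the [[before, after], ...] per-axis form that a Pad
    operation placed in front of an unpadded ('VALID') convolution expects.
    """
    top, left, bottom, right = pads
    return [[0, 0], [top, bottom], [left, right], [0, 0]]


# 3x3 convolution with stride 2 over a 224-pixel axis.
same = tf_same_padding(224, kernel=3, stride=2)
explicit = onnx_pads_to_nhwc_paddings([1, 1, 1, 1])

print("TF SAME (before, after):", same)        # (0, 1)
print("ONNX explicit paddings :", explicit)    # [[0, 0], [1, 1], [1, 1], [0, 0]]
```

Both variants produce a 112-pixel output here, but the sampling grid is shifted by one pixel, which is the kind of silent mismatch that an explicit Pad OP in front of the convolution avoids.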

  • Support for INT8 Quantization, Full INT8 Quantization, INT8 Quantization with INT16 activation, Full INT8 Quantization with INT16 activation and Dynamic Range Quantization.
  • Support for Per-Channel Quantization and Per-Tensor Quantization.
  • Support for GroupConvolution.
  • TFLite does not support TrueDiv(INT), so TrueDiv is avoided if possible.
  • Implement the Resize process for the 5D tensor.
  • Add process to replace Asin with pseudo-Asin.
  • Add process to replace Acos with pseudo-Acos.
  • Add process to replace Abs with pseudo-Abs.
  • Add process to replace GatherND with pseudo-GatherND.
  • Add process to replace HardSwish with pseudo-HardSwish.
  • Add process to replace GridSample with pseudo-GridSample.
  • Add process to replace PRelu with pseudo-PRelu.
  • Add process to replace LeakyRelu with pseudo-LeakyRelu.
  • Add process to replace Power with pseudo-Power.
  • Add process to replace Neg with pseudo-Neg.
  • Add process to replace ArgMax with pseudo-ArgMax.
  • Add process to replace Erf with pseudo-Erf.
  • Added option to fix dynamic batch size N to a specified number.
  • Added option to overwrite dynamic shape input OPs with static shape. --overwrite_input_shape
  • Output in Keras H5 format.
  • Automatically run onnx-simplifier (onnxsim) backend and optimize onnx files before model transformation.
  • Added the ability to automatically generate each OP name and assign OP names to ONNX files in the old format.
  • Supports model splitting. Interrupts model transformation at the specified output name and outputs the model partitioned into subgraphs.
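
To make the idea behind the pseudo-OP replacements listed above concrete, the following NumPy sketch rewrites a few activations using only more primitive operations (maximum, minimum, multiply, clip). These are standard mathematical identities chosen for illustration; they are an assumption about what such a replacement can look like, not necessarily the decompositions that onnx2tf emits, and all function names are hypothetical.

```python
import numpy as np


def pseudo_abs(x: np.ndarray) -> np.ndarray:
    # |x| = relu(x) + relu(-x)
    return np.maximum(x, 0.0) + np.maximum(-x, 0.0)


def pseudo_neg(x: np.ndarray) -> np.ndarray:
    # -x expressed as a multiplication
    return x * -1.0


def pseudo_leaky_relu(x: np.ndarray, alpha: float = 0.01) -> np.ndarray:
    # Valid for 0 <= alpha <= 1: the larger of x and alpha * x
    return np.maximum(x, alpha * x)


def pseudo_prelu(x: np.ndarray, slope: np.ndarray) -> np.ndarray:
    # Positive part passes through, negative part is scaled by the slope
    return np.maximum(x, 0.0) + slope * np.minimum(x, 0.0)


def pseudo_hard_swish(x: np.ndarray) -> np.ndarray:
    # x * relu6(x + 3) / 6
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0


x = np.array([-4.0, -1.0, 0.0, 2.0, 5.0], dtype=np.float32)
print(pseudo_abs(x))
print(pseudo_leaky_relu(x, alpha=0.1))
print(pseudo_hard_swish(x))
```

Each function can be checked against its reference definition (for example np.abs or np.where(x > 0, x, alpha * x)) before relying on it.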

Demo

Video speed is adjusted approximately 50 times slower than actual speed.

Environment

  • onnx
  • onnxruntime
  • onnx-simplifier
  • onnx_graphsurgeon
  • simple_onnx_processing_tools
  • tensorflow==2.10.0

Sample Usage

  • HostPC
    $ docker run --rm -it \
    -v `pwd`:/workdir \
    -w /workdir \
    ghcr.io/pinto0309/onnx2tf:1.5.36
    
    or
    
    $ pip install -U onnx \
    && pip install -U nvidia-pyindex \
    && pip install -U onnx-graphsurgeon \
    && pip install -U onnxruntime \
    && pip install -U onnxsim \
    && pip install -U simple_onnx_processing_tools \
    && pip install -U onnx2tf \
    && pip install -U h5py==3.7.0
    
    or
    
    $ pip install -e .
    

or

  • Google Colaboratory Python3.8+
    !sudo add-apt-repository -y ppa:deadsnakes/ppa
    !sudo apt-get -y update
    !sudo apt-get -y install python3.9
    !sudo apt-get -y install python3.9-dev
    !sudo apt-get -y install python3-pip
    !sudo apt-get -y install python3.9-distutils
    !python3.9 -m pip install -U setuptools \
      && python3.9 -m pip install -U pip \
      && python3.9 -m pip install -U distlib
    !sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
    !sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 2
    !python3.9 -m pip install tensorflow==2.10.0 \
      && python3.9 -m pip install -U onnx \
      && python3.9 -m pip install -U nvidia-pyindex \
      && python3.9 -m pip install -U onnx-graphsurgeon \
      && python3.9 -m pip install -U onnxruntime \
      && python3.9 -m pip install -U onnxsim \
      && python3.9 -m pip install -U simple_onnx_processing_tools \
      && python3.9 -m pip install -U onnx2tf \
      && python3.9 -m pip install -U protobuf==3.20.3 \
      && python3.9 -m pip install -U h5py==3.7.0
    

Run test.

# Float32, Float16
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx

# saved_model with signaturedefs added
# Output in the form of saved_model that can be used for serving or conversion to other frameworks
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx -osd

# Keras h5 format
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx -oh5

# INT8 Quantization, Full INT8 Quantization
# INT8 Quantization with INT16 activation, Full INT8 Quantization with INT16 activation
# Dynamic Range Quantization
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/1.1.1/emotion-ferplus-8.onnx
# INT8 Quantization (per-channel)
$ onnx2tf -i emotion-ferplus-8.onnx -oiqt
# INT8 Quantization (per-tensor)
$ onnx2tf -i emotion-ferplus-8.onnx -oiqt -qt per-tensor

# Parameter replacement (Resize,Transpose,Softmax)
$ rm replace.json
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/1.1.27/human_segmentation_pphumanseg_2021oct.onnx
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/1.1.27/replace.json
$ onnx2tf -i human_segmentation_pphumanseg_2021oct.onnx -prf replace.json
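
The same conversions can presumably be driven from Python through onnx2tf.convert, whose keyword arguments mirror the CLI flags (see the signature under In-script Usage). The sketch below only assembles those keyword arguments; build_convert_kwargs and run_conversion are hypothetical helpers introduced here, and the import of onnx2tf is deferred so that the argument mapping can be inspected without the package or a model file being present.

```python
from typing import Any, Dict, Optional


def build_convert_kwargs(
    onnx_path: str,
    output_dir: str = "saved_model",
    signaturedefs: bool = False,            # CLI: -osd
    keras_h5: bool = False,                 # CLI: -oh5
    int8_tflite: bool = False,              # CLI: -oiqt
    quant_type: str = "per-channel",        # CLI: -qt
    replacement_json: Optional[str] = None  # CLI: -prf
) -> Dict[str, Any]:
    """Map a few CLI-style options onto keyword arguments of onnx2tf.convert."""
    if quant_type not in ("per-channel", "per-tensor"):
        raise ValueError("quant_type must be 'per-channel' or 'per-tensor'")
    kwargs: Dict[str, Any] = {
        "input_onnx_file_path": onnx_path,
        "output_folder_path": output_dir,
        "output_signaturedefs": signaturedefs,
        "output_h5": keras_h5,
        "output_integer_quantized_tflite": int8_tflite,
        "quant_type": quant_type,
    }
    if replacement_json:
        kwargs["param_replacement_file"] = replacement_json
    return kwargs


def run_conversion(**kwargs: Any):
    """Call onnx2tf.convert; imported lazily because it pulls in TensorFlow."""
    from onnx2tf import convert
    return convert(**kwargs)


# Counterpart of: onnx2tf -i emotion-ferplus-8.onnx -oiqt -qt per-tensor
kwargs = build_convert_kwargs(
    "emotion-ferplus-8.onnx", int8_tflite=True, quant_type="per-tensor"
)
print(kwargs)
# run_conversion(**kwargs)  # uncomment once onnx2tf and the model file are available
```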

Perform error checking of ONNX output and TensorFlow output. Verify, one operation at a time, that the error of every output is below a certain threshold. This automatically determines before and after which OPs the tool's automatic conversion of the model failed, so you can see where dimensional compression, dimensional expansion, and dimensional transposition by Reshape and Transpose are failing. Once you have identified the problem area, you can refer to the tutorial on Parameter replacement to modify the tool's behavior.

$ onnx2tf -i mobilenetv2-12.onnx -ois input:1,3,224,224 -cotof -cotoa 1e-1
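
What such an element-wise check amounts to, and why a Reshape that merges the channel axis with a spatial axis tends to fail it, can be sketched in NumPy. This is an illustration under assumptions: it supposes the comparison behaves like numpy.allclose with the documented default tolerances (rtol 0.0, atol 1e-4), and the check helper is hypothetical rather than the tool's own routine.

```python
import numpy as np


def check(onnx_out: np.ndarray, tf_out: np.ndarray,
          rtol: float = 0.0, atol: float = 1e-4) -> str:
    """Return 'Matches' or 'Unmatched' for an element-wise closeness test."""
    ok = np.allclose(tf_out, onnx_out, rtol=rtol, atol=atol)
    return "Matches" if ok else "Unmatched"


# The same tensor in ONNX layout (NCHW) and TensorFlow layout (NHWC).
x_nchw = np.arange(80, dtype=np.float32).reshape(1, 5, 4, 4)
x_nhwc = x_nchw.transpose(0, 2, 3, 1)

# ONNX side: Reshape [1,5,4,4] -> [1,20,4], merging channels with height.
onnx_out = x_nchw.reshape(1, 20, 4)

# Naive TF side: apply the same target shape directly to the NHWC data.
naive_tf_out = x_nhwc.reshape(1, 20, 4)

# Corrected TF side: restore NCHW order first, then reshape.
fixed_tf_out = x_nhwc.transpose(0, 3, 1, 2).reshape(1, 20, 4)

print(check(onnx_out, naive_tf_out))  # Unmatched
print(check(onnx_out, fixed_tf_out))  # Matches
```

The corrected variant costs one extra Transpose, which is the kind of adjustment Parameter replacement lets you request for a specific OP.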


CLI Parameter


$ onnx2tf -h

usage: onnx2tf
[-h]
(-i INPUT_ONNX_FILE_PATH | -V)
[-o OUTPUT_FOLDER_PATH]
[-osd]
[-oh5]
[-ow]
[-oiqt]
[-qt {per-channel,per-tensor}]
[-qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD]
[-ioqd {int8,uint8}]
[-nuo]
[-nuonag]
[-b BATCH_SIZE]
[-ois OVERWRITE_INPUT_SHAPE [OVERWRITE_INPUT_SHAPE ...]]
[-k KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES [KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES ...]]
[-kt KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES [KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES ...]]
[-kat KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES [KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES ...]]
[-onimc OUTPUT_NAMES [OUTPUT_NAMES ...]]
[-dgc]
[-ebu]
[-dsft]
[-nodafc]
[-ofgd]
[-rari64 | -rarf32 | -rafi64 | -raff32]
[-fasr FUSED_ARGMAX_SCALE_RATIO]
[-rasin]
[-racos]
[-rabs]
[-rpr]
[-rlr]
[-rpw]
[-rgn]
[-rng]
[-rhs]
[-rerf]
[-me MVN_EPSILON]
[-prf PARAM_REPLACEMENT_FILE]
[-cgdc]
[-coto | -cotof]
[-coton]
[-cotor CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL]
[-cotoa CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL]
[-n]

optional arguments:
  -h, --help
    show this help message and exit

  -i INPUT_ONNX_FILE_PATH, --input_onnx_file_path INPUT_ONNX_FILE_PATH
    Input onnx file path.

  -V, --version
    Show version and exit.

  -o OUTPUT_FOLDER_PATH, --output_folder_path OUTPUT_FOLDER_PATH
    Output folder path. Default: "saved_model"

  -osd, --output_signaturedefs
    Signature is added to the output for serving or for conversion
    to other model formats. However, this can significantly reduce the speed
    of model conversion and significantly increase the size of the model.

  -oh5, --output_h5
    Output model in Keras (hdf5) format.

  -ow, --output_weights
    Output weights in hdf5 format.

  -oiqt, --output_integer_quantized_tflite
    Output of integer quantized tflite.

  -qt {per-channel,per-tensor}, --quant_type {per-channel,per-tensor}
    Selects whether "per-channel" or "per-tensor" quantization is used.
    Default: "per-channel"

  -qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD, \
    --quant_calib_input_op_name_np_data_path INPUT_NAME NUMPY_FILE_PATH MEAN STD
    Name of the input OP, path of the calibration data file (Numpy) for quantization, and mean and std.
    The specification can be omitted only when the input OP is a single 4D tensor image data.
    If omitted, it is automatically calibrated using 20 normalized MS-COCO images.
    The type of the input OP must be Float32.
    Data for calibration must be pre-normalized to a range of 0 to 1.
    -qcind {input_op_name} {numpy_file_path} {mean} {std}
    Numpy file paths must be specified the same number of times as the number of input OPs.
    Normalize the value of the input OP based on the tensor specified in mean and std.
    (input_value - mean) / std
    Tensors in Numpy file format must be in dimension order after conversion to TF.
    Note that this is intended for deployment on low-resource devices,
    so the batch size is limited to 1 only.

    e.g.
    The example below shows a case where there are three input OPs.
    Assume input0 is 128x128 RGB image data.
    In addition, input0 should be a value that has been divided by 255
    in the preprocessing and normalized to a range between 0 and 1.
    input1 and input2 assume the input of something that is not an image.
    Because input1 and input2 assume something that is not an image,
    the divisor is not 255 when normalizing from 0 to 1.
    "n" is the number of calibration data.

    ONNX INPUT shapes:
      input0: [n,3,128,128]
        mean: [1,3,1,1] -> [[[[0.485]],[[0.456]],[[0.406]]]]
        std : [1,3,1,1] -> [[[[0.229]],[[0.224]],[[0.225]]]]
      input1: [n,64,64]
        mean: [1,64] -> [[0.1, ..., 0.64]]
        std : [1,64] -> [[0.05, ..., 0.08]]
      input2: [n,5]
        mean: [1] -> [0.3]
        std : [1] -> [0.07]

    TensorFlow INPUT shapes (Numpy file ndarray shapes):
      input0: [n,128,128,3]
        mean: [1,1,1,3] -> [[[[0.485, 0.456, 0.406]]]]
        std : [1,1,1,3] -> [[[[0.229, 0.224, 0.225]]]]
      input1: [n,64,64]
        mean: [1,64] -> [[0.1, ..., 0.64]]
        std : [1,64] -> [[0.05, ..., 0.08]]
      input2: [n,5]
        mean: [1] -> [0.3]
        std : [1] -> [0.07]

    -qcind "input0" "../input0.npy" [[[[0.485, 0.456, 0.406]]]] [[[[0.229, 0.224, 0.225]]]]
    -qcind "input1" "./input1.npy" [[0.1, ..., 0.64]] [[0.05, ..., 0.08]]
    -qcind "input2" "input2.npy" [0.3] [0.07]

  -ioqd {int8,uint8}, --input_output_quant_dtype {int8,uint8}
    Input and Output dtypes when doing Full INT8 Quantization.
    "int8"(default) or "uint8"

  -nuo, --not_use_onnxsim
    No optimization by onnx-simplifier is performed.
    If this option is used, the probability of a conversion error is very high.

  -nuonag, --not_use_opname_auto_generate
    Automatic generation of each OP name in the old format ONNX file
    and assignment of OP name are not performed.

  -b BATCH_SIZE, --batch_size BATCH_SIZE
    Fixes the dynamic batch size to the specified numeric batch size.
    A value of 1 or more must be specified.

  -ois OVERWRITE_INPUT_SHAPE [OVERWRITE_INPUT_SHAPE ...], \
      --overwrite_input_shape OVERWRITE_INPUT_SHAPE [OVERWRITE_INPUT_SHAPE ...]
    Overwrite the input shape.
    The format is
    "i1:dim0,...,dimN" "i2:dim0,...,dimN" "i3:dim0,...,dimN"
    When there is only one input, for example,
    "data:1,3,224,224"
    When there are multiple inputs, for example,
    "data1:1,3,224,224" "data2:1,3,112" "data3:5"
    A value of 1 or more must be specified.
    Numerical values other than dynamic dimensions are ignored.
    Ignores --batch_size if specified at the same time as --batch_size.

  -k KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES [KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES ...], \
      --keep_ncw_or_nchw_or_ncdhw_input_names KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES \
          [KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES ...]
    Holds the NCW or NCHW or NCDHW of the input shape for the specified INPUT OP names.
    If a nonexistent INPUT OP name is specified, it is ignored.
    Valid only for 3D, 4D and 5D input tensors.
    e.g. --keep_ncw_or_nchw_or_ncdhw_input_names "input0" "input1" "input2"

  -kt KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES [KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES ...], \
      --keep_nwc_or_nhwc_or_ndhwc_input_names KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES \
          [KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES ...]
    Holds the NWC or NHWC or NDHWC of the input shape for the specified INPUT OP names.
    If a nonexistent INPUT OP name is specified, it is ignored.
    If the input OP name is the same as the input OP name specified
    in the keep_ncw_or_nchw_or_ncdhw_input_names option, it is ignored.
    Valid only for 3D, 4D and 5D input tensors.
    e.g. --keep_nwc_or_nhwc_or_ndhwc_input_names "input0" "input1" "input2"

  -kat KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES [KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES ...], \
      --keep_shape_absolutely_input_names KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES \
        [KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES ...]
    Name of the INPUT that unconditionally maintains its shape.
    If a nonexistent INPUT OP name is specified, it is ignored.
    e.g. --keep_shape_absolutely_input_names "input0" "input1" "input2"

  -onimc OUTPUT_NAMES [OUTPUT_NAMES ...], \
      --output_names_to_interrupt_model_conversion OUTPUT_NAMES [OUTPUT_NAMES ...]
    Output names that interrupt model conversion.
    Interrupts model transformation at the specified output name and outputs the
    model partitioned into subgraphs.
    e.g. --output_names_to_interrupt_model_conversion "output0" "output1" "output2"

  -dgc, --disable_group_convolution
    Disable GroupConvolution and replace it with SeparableConvolution for
    output to saved_model format.

  -ebu, --enaable_batchmatmul_unfold
    BatchMatMul is separated batch by batch to generate a primitive MatMul.

  -dsft, --disable_suppression_flextranspose
    Disables FlexTranspose generation suppression.

  -nodafc, --number_of_dimensions_after_flextranspose_compression
    Number of Transpose OP dimensions generated after avoiding FlexTranspose generation.
    Default: 5

  -ofgd, --optimization_for_gpu_delegate
    Replace operations that do not support gpu delegate with those
    that do as much as possible.

  -rari64, --replace_argmax_to_reducemax_and_indicies_is_int64
    Replace ArgMax with a ReduceMax. The returned indicies are int64.
    Only one of replace_argmax_to_reducemax_and_indicies_is_int64
    and replace_argmax_to_reducemax_and_indicies_is_float32
    and replace_argmax_to_fused_argmax_and_indicies_is_int64
    and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.

  -rarf32, --replace_argmax_to_reducemax_and_indicies_is_float32
    Replace ArgMax with a ReduceMax. The returned indicies are float32.
    Only one of replace_argmax_to_reducemax_and_indicies_is_int64
    and replace_argmax_to_reducemax_and_indicies_is_float32
    and replace_argmax_to_fused_argmax_and_indicies_is_int64
    and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.

  -rafi64, --replace_argmax_to_fused_argmax_and_indicies_is_int64
    Replace ArgMax with a Fused_ArgMax. The returned indicies are int64.
    It improves inference speed at the cost of a small sacrifice in accuracy.
    See. https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
    Currently, only 4D tensors are supported.
    Only one of replace_argmax_to_reducemax_and_indicies_is_int64
    and replace_argmax_to_reducemax_and_indicies_is_float32
    and replace_argmax_to_fused_argmax_and_indicies_is_int64
    and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.

  -raff32, --replace_argmax_to_fused_argmax_and_indicies_is_float32
    Replace ArgMax with a Fused_ArgMax. The returned indicies are float32.
    It improves inference speed at the cost of a small sacrifice in accuracy.
    See. https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
    Currently, only 4D tensors are supported.
    Only one of replace_argmax_to_reducemax_and_indicies_is_int64
    and replace_argmax_to_reducemax_and_indicies_is_float32
    and replace_argmax_to_fused_argmax_and_indicies_is_int64
    and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.

  -fasr FUSED_ARGMAX_SCALE_RATIO, --fused_argmax_scale_ratio FUSED_ARGMAX_SCALE_RATIO
    For Fused ArgMax.
    Scale ratio when generating Fused ArgMax.
    0.0 < fused_argmax_scale_ratio <= 1.0
    Default: 0.5

  -rasin, --replace_asin_to_pseudo_asin
    Replace Asin with a pseudo Asin.

  -racos, --replace_acos_to_pseudo_acos
    Replace Acos with a pseudo Acos.

  -rabs, --replace_abs_to_pseudo_abs
    Replace Abs with a pseudo Abs.

  -rpr, --replace_prelu_to_pseudo_prelu
    Replace PReLU with a pseudo PReLU.

  -rlr, --replace_leakyrelu_to_pseudo_leakyrelu
    Replace LeakyReLU with a pseudo LeakyReLU.

  -rpw, --replace_power_to_pseudo_power
    Replace Power with a pseudo Power.

  -rgn, --replace_gathernd_to_pseudo_gathernd
    Replace GatherND with a pseudo GatherND.

  -rng, --replace_neg_to_pseudo_neg
    Replace Neg with a pseudo Neg.

  -rhs, --replace_hardswish_to_pseudo_hardswish
    Replace HardSwish with a pseudo HardSwish.

  -rerf, --replace_erf_to_pseudo_erf
    Replace Erf with a pseudo Erf.

  -me MVN_EPSILON, --mvn_epsilon MVN_EPSILON
    For MeanVarianceNormalization.
    The number to be added to the variance to avoid division by zero
    when normalizing the value.
    (input_tensor - mean) / tf.sqrt(variance + mvn_epsilon)
    Default: 0.0000000001

  -prf PARAM_REPLACEMENT_FILE, --param_replacement_file PARAM_REPLACEMENT_FILE
    Parameter replacement file path. (.json)

  -cgdc, --check_gpu_delegate_compatibility
    Run TFLite ModelAnalyzer on the generated Float16 tflite model
    to check if the model can be supported by GPU Delegate.
    e.g.
    """
    === TFLite ModelAnalyzer ===

    Your TFLite model has '1' subgraph(s). In the subgraph description below,
    T# represents the Tensor numbers. For example, in Subgraph#0, the RESHAPE op takes
    tensor #0 and tensor #6 as input and produces tensor #7 as output.

    Subgraph#0 main(T#0) -> [T#17]
      Op#0 RESHAPE(T#0, T#6[2, 8, 8, 3, 2, ...]) -> [T#7]
      Op#1 SPLIT(T#5[0], T#7) -> [T#8, T#9]
      Op#2 RESHAPE(T#8, T#1[8, 8, 3, 2, 2]) -> [T#10]
      Op#3 TRANSPOSE(T#10, T#4[0, 3, 1, 4, 2]) -> [T#11]
      Op#4 RESHAPE(T#11, T#2[1, 8, 2, 8, 2, ...]) -> [T#12]
      Op#5 RESHAPE(T#9, T#1[8, 8, 3, 2, 2]) -> [T#13]
      Op#6 TRANSPOSE(T#13, T#4[0, 3, 1, 4, 2]) -> [T#14]
      Op#7 RESHAPE(T#14, T#2[1, 8, 2, 8, 2, ...]) -> [T#15]
      Op#8 CONCATENATION(T#12, T#15) -> [T#16]
      Op#9 RESHAPE(T#16, T#3[2, 16, 16, 3]) -> [T#17]

    Tensors of Subgraph#0
      T#0(inputs_0) shape:[2, 8, 8, 12], type:FLOAT32
      T#1(model/tf.compat.v1.squeeze_2/Squeeze) shape:[5], type:INT32 RO 20 bytes, data:[8, 8, 3, 2, 2]
      T#2(model/tf.expand_dims_1/ExpandDims) shape:[6], type:INT32 RO 24 bytes, data:[1, 8, 2, 8, 2, ...]
      T#3(model/tf.reshape_1/Reshape/shape) shape:[4], type:INT32 RO 16 bytes, data:[2, 16, 16, 3]
      T#4(model/tf.compat.v1.transpose/transpose/perm) shape:[5], type:INT32 RO 20 bytes, data:[0, 3, 1, 4, 2]
      T#5(model/tf.concat/concat/axis) shape:[], type:INT32 RO 4 bytes, data:[0]
      T#6(model/tf.reshape/Reshape/shape) shape:[6], type:INT32 RO 24 bytes, data:[2, 8, 8, 3, 2, ...]
      T#7(model/tf.reshape/Reshape) shape:[2, 8, 8, 3, 2, 2], type:FLOAT32
      T#8(model/tf.split/split) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
      T#9(model/tf.split/split1) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
      T#10(model/tf.compat.v1.squeeze_1/Squeeze) shape:[8, 8, 3, 2, 2], type:FLOAT32
      T#11(model/tf.compat.v1.transpose/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
      T#12(model/tf.expand_dims/ExpandDims) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
      T#13(model/tf.compat.v1.squeeze_2/Squeeze1) shape:[8, 8, 3, 2, 2], type:FLOAT32
      T#14(model/tf.compat.v1.transpose_1/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
      T#15(model/tf.expand_dims_1/ExpandDims1) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
      T#16(model/tf.concat/concat) shape:[2, 8, 2, 8, 2, 3], type:FLOAT32
      T#17(Identity) shape:[2, 16, 16, 3], type:FLOAT32

    Your model looks compatibile with GPU delegate with TFLite runtime version 2.10.0.
    But it doesn't guarantee that your model works well with GPU delegate.
    There could be some runtime incompatibililty happen.
    ---------------------------------------------------------------
                  Model size:       2988 bytes
        Non-data buffer size:       2757 bytes (92.27 %)
      Total data buffer size:        231 bytes (07.73 %)
        (Zero value buffers):          4 bytes (00.13 %)

    * Buffers of TFLite model are mostly used for constant tensors.
      And zero value buffers are buffers filled with zeros.
      Non-data buffers area are used to store operators, subgraphs and etc.
      You can find more details from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/schema/schema.fbs
    """

  -coto, --check_onnx_tf_outputs_elementwise_close
    Returns "Matches" if the output of onnx and the output of TF are
    within acceptable proximity element by element.
    Returns "Unmatched" if the output of onnx and the output of TF are
    not within acceptable proximity element by element.
    If the output of onnx is 1D, it returns "Skipped" and skips the comparison
    between the output of onnx and that of TF. This is because when undefined
    dimensions are present, a situation often arises where very large index
    values are compared, causing OutOfMemory.
    Only the output content of the models final output OP is checked.

  -cotof, --check_onnx_tf_outputs_elementwise_close_full
    Returns "Matches" if the output of onnx and the output of TF are
    within acceptable proximity element by element.
    Check the output of all OPs in sequence from the beginning,
    including all but the final output OP of the model.
    Returns "Unmatched" if the output of onnx and the output of TF are
    not within acceptable proximity element by element.
    If the output of onnx is 1D, it returns "Skipped" and skips the comparison
    between the output of onnx and that of TF. This is because when undefined
    dimensions are present, a situation often arises where very large index
    values are compared, causing OutOfMemory.
    It is very time consuming because it performs as many inferences as
    there are operations.

  -coton, --check_onnx_tf_outputs_sample_data_normalization
    norm: Validate using random data normalized to the range 0.0 to 1.0
    denorm: Validate using random data in the range 0.0 to 255.0
    If there is a normalization layer at the model's entry point, or
    if the model was trained on denormalized data, "denorm" must be specified.
    Default: "norm"

  -cotor CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL,\
    --check_onnx_tf_outputs_elementwise_close_rtol CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL
    The relative tolerance parameter.
    Default: 0.0

  -cotoa CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL,\
    --check_onnx_tf_outputs_elementwise_close_atol CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL
    The absolute tolerance parameter.
    Default: 1e-4

  -n, --non_verbose
    Do not show all information logs. Only error logs are displayed.
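
The calibration data that -qcind expects (Float32, pre-normalized to 0 to 1, in the dimension order after conversion to TF, with mean and std shaped to broadcast against it) can be prepared along the following lines. This is a minimal sketch: the images are synthetic random data standing in for real calibration images, the temporary directory is an arbitrary choice, and the final normalization line only reproduces the (input_value - mean) / std formula from the help text to confirm that the shapes broadcast.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for n=20 RGB calibration images of 128x128 in ONNX (NCHW) order.
rng = np.random.default_rng(0)
images_nchw = rng.integers(0, 256, size=(20, 3, 128, 128), dtype=np.uint8)

# Reorder to the TensorFlow layout (NHWC), convert to Float32, scale to [0, 1].
calib = images_nchw.transpose(0, 2, 3, 1).astype(np.float32) / 255.0

# mean/std in TF dimension order, shape [1, 1, 1, 3], as in the -qcind example.
mean = np.array([[[[0.485, 0.456, 0.406]]]], dtype=np.float32)
std = np.array([[[[0.229, 0.224, 0.225]]]], dtype=np.float32)

# Shape check only: the documented normalization must broadcast cleanly.
normalized = (calib - mean) / std

# Save the pre-normalized (0 to 1) data; this is the file passed to -qcind.
path = os.path.join(tempfile.mkdtemp(), "input0.npy")
np.save(path, calib)

loaded = np.load(path)
print(path)
print(loaded.shape, loaded.dtype, float(loaded.min()), float(loaded.max()))
```

The saved file would then be passed together with the input OP name and the mean and std values, in the same form as the -qcind "input0" example in the help text.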

In-script Usage

>>> from onnx2tf import convert
>>> help(convert)

Help on function convert in module onnx2tf:

convert(
  input_onnx_file_path: Union[str, NoneType] = '',
  onnx_graph: Union[onnx.onnx_ml_pb2.ModelProto, NoneType] = None,
  output_folder_path: Union[str, NoneType] = 'saved_model',
  output_signaturedefs: Optional[bool] = False,
  output_h5: Optional[bool] = False,
  output_weights: Optional[bool] = False,
  output_integer_quantized_tflite: Optional[bool] = False,
  quant_type: Optional[str] = 'per-channel',
  quant_calib_input_op_name_np_data_path: Optional[List] = None,
  input_output_quant_dtype: Optional[str] = 'int8',
  not_use_onnxsim: Optional[bool] = False,
  not_use_opname_auto_generate: Optional[bool] = False,
  batch_size: Union[int, NoneType] = None,
  overwrite_input_shape: Union[List[str], NoneType] = None,
  keep_ncw_or_nchw_or_ncdhw_input_names: Union[List[str], NoneType] = None,
  keep_nwc_or_nhwc_or_ndhwc_input_names: Union[List[str], NoneType] = None,
  keep_shape_absolutely_input_names: Optional[List[str]] = None,
  output_names_to_interrupt_model_conversion: Union[List[str], NoneType] = None,
  disable_group_convolution: Union[bool, NoneType] = False,
  enaable_batchmatmul_unfold: Optional[bool] = False,
  disable_suppression_flextranspose: Optional[bool] = False,
  number_of_dimensions_after_flextranspose_compression: Optional[int] = 5,
  optimization_for_gpu_delegate: Optional[bool] = False,
  replace_argmax_to_reducemax_and_indicies_is_int64: Union[bool, NoneType] = False,
  replace_argmax_to_reducemax_and_indicies_is_float32: Union[bool, NoneType] = False,
  replace_argmax_to_fused_argmax_and_indicies_is_int64: Union[bool, NoneType] = False,
  replace_argmax_to_fused_argmax_and_indicies_is_float32: Union[bool, NoneType] = False,
  fused_argmax_scale_ratio: Union[float, NoneType] = 0.5,
  replace_asin_to_pseudo_asin: Union[bool, NoneType] = False,
  replace_acos_to_pseudo_acos: Union[bool, NoneType] = False,
  replace_abs_to_pseudo_abs: Union[bool, NoneType] = False,
  replace_prelu_to_pseudo_prelu: Union[bool, NoneType] = False,
  replace_leakyrelu_to_pseudo_leakyrelu: Union[bool, NoneType] = False,
  replace_power_to_pseudo_power: Optional[bool] = False,
  replace_gathernd_to_pseudo_gathernd: Optional[bool] = False,
  replace_neg_to_pseudo_neg: Optional[bool] = False,
  replace_hardswish_to_pseudo_hardswish: Optional[bool] = False,
  replace_erf_to_pseudo_erf: Optional[bool] = False,
  mvn_epsilon: Union[float, NoneType] = 0.0000000001,
  param_replacement_file: Optional[str] = '',
  check_gpu_delegate_compatibility: Optional[bool] = False,
  check_onnx_tf_outputs_elementwise_close: Optional[bool] = False,
  check_onnx_tf_outputs_elementwise_close_full: Optional[bool] = False,
  check_onnx_tf_outputs_sample_data_normalization: Optional[str] = 'norm',
  check_onnx_tf_outputs_elementwise_close_rtol: Optional[float] = 0.0,
  check_onnx_tf_outputs_elementwise_close_atol: Optional[float] = 1e-4,
  non_verbose: Union[bool, NoneType] = False
) -> keras.engine.training.Model

    Convert ONNX to TensorFlow models.

    Parameters
    ----------
    input_onnx_file_path: Optional[str]
      Input onnx file path.
      Either input_onnx_file_path or onnx_graph must be specified.

    onnx_graph: Optional[onnx.ModelProto]
      onnx.ModelProto.
      Either input_onnx_file_path or onnx_graph must be specified.
      If onnx_graph is specified, input_onnx_file_path is ignored and onnx_graph is processed.

    output_folder_path: Optional[str]
      Output tensorflow model folder path.
      Default: "saved_model"

    output_signaturedefs: Optional[bool]
      Signature is added to the output for serving or for conversion
      to other model formats. However, this can significantly reduce the speed
      of model conversion and significantly increase the size of the model.

    output_h5: Optional[bool]
      Output model in Keras H5 format.

    output_weights: Optional[bool]
        Output weights in hdf5 format.

    output_integer_quantized_tflite: Optional[bool]
      Output of integer quantized tflite.

    quant_type: Optional[str]
      Selects whether "per-channel" or "per-tensor" quantization is used.
      Default: "per-channel"

    quant_calib_input_op_name_np_data_path: Optional[List]
      --quant_calib_input_op_name_np_data_path INPUT_NAME NUMPY_FILE_PATH MEAN STD
      Name of the input OP, path of the calibration data file (Numpy) for quantization, and mean and std.
      This option can be omitted only when the model has a single input OP that takes 4D image tensor data.
      If omitted, calibration is performed automatically using 20 normalized MS-COCO images.
      The type of the input OP must be Float32.
      Data for calibration must be pre-normalized to a range of 0 to 1.
      -qcind {input_op_name} {numpy_file_path} {mean} {std}
      One Numpy file path must be specified for each input OP.
      The value of the input OP is normalized using the tensors specified in mean and std.
      (input_value - mean) / std
      Tensors in the Numpy files must be in the dimension order used after conversion to TF.
      Note that this is intended for deployment on low-resource devices,
      so the batch size is limited to 1 only.

      e.g.
      The example below shows a case where there are three input OPs.
      Assume input0 is 128x128 RGB image data.
      In addition, input0 should be a value that has been divided by 255
      in the preprocessing and normalized to a range between 0 and 1.
      input1 and input2 are assumed to be inputs other than images,
      so the divisor is not 255 when normalizing them to the range 0 to 1.
      "n" is the number of calibration samples.

      ONNX INPUT shapes:
        input0: [n,3,128,128]
          mean: [1,3,1,1] -> [[[[0.485]],[[0.456]],[[0.406]]]]
          std : [1,3,1,1] -> [[[[0.229]],[[0.224]],[[0.225]]]]
        input1: [n,64,64]
          mean: [1,64] -> [[0.1, ..., 0.64]]
          std : [1,64] -> [[0.05, ..., 0.08]]
        input2: [n,5]
          mean: [1] -> [0.3]
          std : [1] -> [0.07]

      TensorFlow INPUT shapes (Numpy file ndarray shapes):
        input0: [n,128,128,3]
          mean: [1,1,1,3] -> [[[[0.485, 0.456, 0.406]]]]
          std : [1,1,1,3] -> [[[[0.229, 0.224, 0.225]]]]
        input1: [n,64,64]
          mean: [1,64] -> [[0.1, ..., 0.64]]
          std : [1,64] -> [[0.05, ..., 0.08]]
        input2: [n,5]
          mean: [1] -> [0.3]
          std : [1] -> [0.07]

        qcind=[
            ["input0","../input0.npy",[[[[0.485, 0.456, 0.406]]]],[[[[0.229, 0.224, 0.225]]]]],
            ["input1","./input1.npy",[[0.1, ..., 0.64]],[[0.05, ..., 0.08]]],
            ["input2","input2.npy",[0.3],[0.07]],
        ]

    input_output_quant_dtype: Optional[str]
      Input and Output dtypes when doing Full INT8 Quantization.
      "int8"(default) or "uint8"

    not_use_onnxsim: Optional[bool]
      No optimization by onnx-simplifier is performed.
      If this option is used, the probability of a conversion error is very high.

    not_use_opname_auto_generate: Optional[bool]
      OP names are not automatically generated and assigned
      for ONNX files in the old format.

    batch_size: Optional[int]
      Fixes the dynamic batch size to the specified numeric batch size.
      A value of 1 or more must be specified.

    overwrite_input_shape: Optional[List[str]]
      Overwrite the input shape.
      The format is
      ['i1:dim0,dim1,...,dimN', 'i2:dim0,dim1,...,dimN', 'i3:dim0,dim1,...,dimN']
      When there is only one input, for example,
      ['data:1,3,224,224']
      When there are multiple inputs, for example,
      ['data1:1,3,224,224','data2:1,3,112','data3:5']
      A value of 1 or more must be specified.
      Numerical values other than dynamic dimensions are ignored.
      If specified together with batch_size, batch_size is ignored.

    keep_ncw_or_nchw_or_ncdhw_input_names: Optional[List[str]]
      Keeps the input shape in NCW, NCHW, or NCDHW order for the specified INPUT OP names.
      If a nonexistent INPUT OP name is specified, it is ignored.
      Valid only for 3D, 4D and 5D input tensors.
      e.g.
      keep_ncw_or_nchw_or_ncdhw_input_names=['input0','input1','input2']

    keep_nwc_or_nhwc_or_ndhwc_input_names: Optional[List[str]]
      Keeps the input shape in NWC, NHWC, or NDHWC order for the specified INPUT OP names.
      If a nonexistent INPUT OP name is specified, it is ignored.
      If the input OP name is the same as the input OP name specified
      in the keep_ncw_or_nchw_or_ncdhw_input_names option, it is ignored.
      Valid only for 3D, 4D and 5D input tensors.
      e.g.
      keep_nwc_or_nhwc_or_ndhwc_input_names=['input0','input1','input2']

    keep_shape_absolutely_input_names: Optional[List[str]]
        Name of the INPUT that unconditionally maintains its shape.
        If a nonexistent INPUT OP name is specified, it is ignored.
        e.g.
        keep_shape_absolutely_input_names=['input0','input1','input2']

    output_names_to_interrupt_model_conversion: Optional[List[str]]
      Output names that interrupt model conversion.
      Interrupts model conversion at the specified output names
      and outputs the model partitioned into subgraphs.
      e.g.
      output_names_to_interrupt_model_conversion=['output0','output1','output2']

    disable_group_convolution: Optional[bool]
      Disable GroupConvolution and replace it with SeparableConvolution for
      output to saved_model format.

    enaable_batchmatmul_unfold: Optional[bool]
      BatchMatMul is separated batch by batch to generate a primitive MatMul.

    disable_suppression_flextranspose: Optional[bool]
      Disables FlexTranspose generation suppression.

    number_of_dimensions_after_flextranspose_compression: Optional[int]
      Number of Transpose OP dimensions generated after avoiding FlexTranspose generation.
      Default: 5

    optimization_for_gpu_delegate: Optional[bool]
        Wherever possible, replace operations that the GPU delegate does not
        support with operations that it does support.

    replace_argmax_to_reducemax_and_indicies_is_int64: Optional[bool]
      Replace ArgMax with a ReduceMax. The returned indices are int64.
      Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
      replace_argmax_to_reducemax_and_indicies_is_float32 and
      replace_argmax_to_fused_argmax_and_indicies_is_int64 and
      replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
      Default: False

    replace_argmax_to_reducemax_and_indicies_is_float32: Optional[bool]
      Replace ArgMax with a ReduceMax. The returned indices are float32.
      Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
      replace_argmax_to_reducemax_and_indicies_is_float32 and
      replace_argmax_to_fused_argmax_and_indicies_is_int64 and
      replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
      Default: False

    replace_argmax_to_fused_argmax_and_indicies_is_int64: Optional[bool]
      Replace ArgMax with a Fused ArgMax. The returned indices are int64.
      It improves inference speed at the cost of a small sacrifice in accuracy.
      See https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
      Currently, only 4D tensors are supported.
      Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
      replace_argmax_to_reducemax_and_indicies_is_float32 and
      replace_argmax_to_fused_argmax_and_indicies_is_int64 and
      replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
      Default: False

    replace_argmax_to_fused_argmax_and_indicies_is_float32: Optional[bool]
      Replace ArgMax with a Fused ArgMax. The returned indices are float32.
      It improves inference speed at the cost of a small sacrifice in accuracy.
      See https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
      Currently, only 4D tensors are supported.
      Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
      replace_argmax_to_reducemax_and_indicies_is_float32 and
      replace_argmax_to_fused_argmax_and_indicies_is_int64 and
      replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
      Default: False

    fused_argmax_scale_ratio: Optional[float]
      For Fused ArgMax.
      Scale ratio when generating Fused ArgMax.
      0.0 < fused_argmax_scale_ratio <= 1.0
      Default: 0.5

    replace_asin_to_pseudo_asin: Optional[bool]
      Replace Asin with a pseudo Asin.

    replace_acos_to_pseudo_acos: Optional[bool]
      Replace Acos with a pseudo Acos.

    replace_abs_to_pseudo_abs: Optional[bool]
      Replace Abs with a pseudo Abs.

    replace_prelu_to_pseudo_prelu: Optional[bool]
      Replace PReLU with a pseudo PReLU.

    replace_leakyrelu_to_pseudo_leakyrelu: Optional[bool]
      Replace LeakyReLU with a pseudo LeakyReLU.

    replace_power_to_pseudo_power: Optional[bool]
      Replace Power with a pseudo Power.

    replace_gathernd_to_pseudo_gathernd: Optional[bool]
      Replace GatherND with a pseudo GatherND.

    replace_neg_to_pseudo_neg: Optional[bool]
      Replace Neg with a pseudo Neg.

    replace_hardswish_to_pseudo_hardswish: Optional[bool]
      Replace HardSwish with a pseudo HardSwish.

    replace_erf_to_pseudo_erf: Optional[bool]
      Replace Erf with a pseudo Erf.

    mvn_epsilon: Optional[float]
      For MeanVarianceNormalization.
      The number to be added to the variance to avoid division by zero
      when normalizing the value.
      (input_tensor - mean) / tf.sqrt(variance + mvn_epsilon)
      Default: 0.0000000001

    param_replacement_file: Optional[str]
      Parameter replacement file path. (.json)

    check_gpu_delegate_compatibility: Optional[bool]
      Run TFLite ModelAnalyzer on the generated Float16 tflite model
      to check if the model can be supported by GPU Delegate.
      e.g.
      """
      === TFLite ModelAnalyzer ===

      Your TFLite model has '1' subgraph(s). In the subgraph description below,
      T# represents the Tensor numbers. For example, in Subgraph#0, the RESHAPE op takes
      tensor #0 and tensor #6 as input and produces tensor #7 as output.

      Subgraph#0 main(T#0) -> [T#17]
        Op#0 RESHAPE(T#0, T#6[2, 8, 8, 3, 2, ...]) -> [T#7]
        Op#1 SPLIT(T#5[0], T#7) -> [T#8, T#9]
        Op#2 RESHAPE(T#8, T#1[8, 8, 3, 2, 2]) -> [T#10]
        Op#3 TRANSPOSE(T#10, T#4[0, 3, 1, 4, 2]) -> [T#11]
        Op#4 RESHAPE(T#11, T#2[1, 8, 2, 8, 2, ...]) -> [T#12]
        Op#5 RESHAPE(T#9, T#1[8, 8, 3, 2, 2]) -> [T#13]
        Op#6 TRANSPOSE(T#13, T#4[0, 3, 1, 4, 2]) -> [T#14]
        Op#7 RESHAPE(T#14, T#2[1, 8, 2, 8, 2, ...]) -> [T#15]
        Op#8 CONCATENATION(T#12, T#15) -> [T#16]
        Op#9 RESHAPE(T#16, T#3[2, 16, 16, 3]) -> [T#17]

      Tensors of Subgraph#0
        T#0(inputs_0) shape:[2, 8, 8, 12], type:FLOAT32
        T#1(model/tf.compat.v1.squeeze_2/Squeeze) shape:[5], type:INT32 RO 20 bytes, data:[8, 8, 3, 2, 2]
        T#2(model/tf.expand_dims_1/ExpandDims) shape:[6], type:INT32 RO 24 bytes, data:[1, 8, 2, 8, 2, ...]
        T#3(model/tf.reshape_1/Reshape/shape) shape:[4], type:INT32 RO 16 bytes, data:[2, 16, 16, 3]
        T#4(model/tf.compat.v1.transpose/transpose/perm) shape:[5], type:INT32 RO 20 bytes, data:[0, 3, 1, 4, 2]
        T#5(model/tf.concat/concat/axis) shape:[], type:INT32 RO 4 bytes, data:[0]
        T#6(model/tf.reshape/Reshape/shape) shape:[6], type:INT32 RO 24 bytes, data:[2, 8, 8, 3, 2, ...]
        T#7(model/tf.reshape/Reshape) shape:[2, 8, 8, 3, 2, 2], type:FLOAT32
        T#8(model/tf.split/split) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
        T#9(model/tf.split/split1) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
        T#10(model/tf.compat.v1.squeeze_1/Squeeze) shape:[8, 8, 3, 2, 2], type:FLOAT32
        T#11(model/tf.compat.v1.transpose/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
        T#12(model/tf.expand_dims/ExpandDims) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
        T#13(model/tf.compat.v1.squeeze_2/Squeeze1) shape:[8, 8, 3, 2, 2], type:FLOAT32
        T#14(model/tf.compat.v1.transpose_1/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
        T#15(model/tf.expand_dims_1/ExpandDims1) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
        T#16(model/tf.concat/concat) shape:[2, 8, 2, 8, 2, 3], type:FLOAT32
        T#17(Identity) shape:[2, 16, 16, 3], type:FLOAT32

      Your model looks compatibile with GPU delegate with TFLite runtime version 2.10.0.
      But it doesn't guarantee that your model works well with GPU delegate.
      There could be some runtime incompatibililty happen.
      ---------------------------------------------------------------
                    Model size:       2988 bytes
          Non-data buffer size:       2757 bytes (92.27 %)
        Total data buffer size:        231 bytes (07.73 %)
          (Zero value buffers):          4 bytes (00.13 %)

      * Buffers of TFLite model are mostly used for constant tensors.
        And zero value buffers are buffers filled with zeros.
        Non-data buffers area are used to store operators, subgraphs and etc.
        You can find more details from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/schema/schema.fbs
      """

    check_onnx_tf_outputs_elementwise_close: Optional[bool]
        Returns "Matches" if the output of onnx and the output of TF are
        within acceptable proximity element by element.
        Returns "Unmatched" if the output of onnx and the output of TF are
        not within acceptable proximity element by element.
        If the output of onnx is 1D, it returns "Skipped" and skips the comparison
        between the output of onnx and that of TF. This is because when undefined
        dimensions are present, a situation often arises where very large index
        values are compared, causing OutOfMemory.
        Only the output content of the model's final output OP is checked.

    check_onnx_tf_outputs_elementwise_close_full: Optional[bool]
        Returns "Matches" if the output of onnx and the output of TF are
        within acceptable proximity element by element.
        The outputs of all OPs are checked in sequence from the beginning,
        not only the final output OP of the model.
        Returns "Unmatched" if the output of onnx and the output of TF are
        not within acceptable proximity element by element.
        If the output of onnx is 1D, it returns "Skipped" and skips the comparison
        between the output of onnx and that of TF. This is because when undefined
        dimensions are present, a situation often arises where very large index
        values are compared, causing OutOfMemory.
        It is very time consuming because it performs as many inferences as
        there are operations.

    check_onnx_tf_outputs_sample_data_normalization: Optional[str]
        norm: Validate using random data normalized to the range 0.0 to 1.0
        denorm: Validate using random data in the range 0.0 to 255.0
        If there is a normalization layer at the model's entry point, or
        if the model was trained on denormalized data, "denorm" must be specified.
        Default: "norm"

    check_onnx_tf_outputs_elementwise_close_rtol: Optional[float]
        The relative tolerance parameter.
        Default: 0.0

    check_onnx_tf_outputs_elementwise_close_atol: Optional[float]
        The absolute tolerance parameter.
        Default: 1e-4

    non_verbose: Optional[bool]
      Do not show all information logs. Only error logs are displayed.
      Default: False

    Returns
    ----------
    model: tf.keras.Model
      Model
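
As a rough illustration of how a few of the parameters above interact, the sketch below assembles a keyword-argument dictionary before a conversion is started. The helper names (`format_input_shapes`, `build_convert_kwargs`) and the file name `model.onnx` are hypothetical and are not part of the tool. The block only builds and checks the arguments, on the assumption that the result would then be handed to the conversion function documented above.

```python
from typing import Dict, List, Optional

# The four ArgMax replacement flags are documented as mutually exclusive.
ARGMAX_FLAGS = (
    "replace_argmax_to_reducemax_and_indicies_is_int64",
    "replace_argmax_to_reducemax_and_indicies_is_float32",
    "replace_argmax_to_fused_argmax_and_indicies_is_int64",
    "replace_argmax_to_fused_argmax_and_indicies_is_float32",
)


def format_input_shapes(shapes: Dict[str, List[int]]) -> List[str]:
    """Turn {'data': [1, 3, 224, 224]} into ['data:1,3,224,224']."""
    formatted = []
    for name, dims in shapes.items():
        if any(d < 1 for d in dims):
            raise ValueError(f"{name}: every dimension must be 1 or more")
        formatted.append(f"{name}:{','.join(str(d) for d in dims)}")
    return formatted


def build_convert_kwargs(
    input_onnx_file_path: str,
    input_shapes: Optional[Dict[str, List[int]]] = None,
    **flags: bool,
) -> dict:
    """Hypothetical helper: collect and sanity-check conversion arguments."""
    enabled = [name for name in ARGMAX_FLAGS if flags.get(name)]
    if len(enabled) > 1:
        raise ValueError(f"Only one ArgMax replacement may be set: {enabled}")
    kwargs = {"input_onnx_file_path": input_onnx_file_path, **flags}
    if input_shapes:
        kwargs["overwrite_input_shape"] = format_input_shapes(input_shapes)
    return kwargs


kwargs = build_convert_kwargs(
    "model.onnx",  # placeholder path
    input_shapes={"data1": [1, 3, 224, 224], "data2": [1, 3, 112], "data3": [5]},
    replace_argmax_to_fused_argmax_and_indicies_is_int64=True,
)
print(kwargs["overwrite_input_shape"])
# Assumption: the result would then be passed on, e.g. onnx2tf.convert(**kwargs).
```

Checking the mutually exclusive flags up front is a design choice of this sketch; it surfaces a bad combination before any model file is read.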

Parameter replacement

This tool converts NCW to NWC, NCHW to NHWC, NCDHW to NDHWC, NCDDHW to NDDHWC, and NCDDDDDDHW to NDDDDDDHWC. Therefore, as stated in Key concept, the conversion will inevitably break down at some point in the model. You need to look at the entire conversion log to see which OP transpositions are failing and correct them yourself. I deliberately explain very little, because I know that no matter how much detail I put in the README, you will not read it at all. An attribute, an INPUT constant, or an INPUT initializer can be replaced with a specified value.

Starting from v1.3.0, almost all OPs except for some special OPs support pre- and post-transposition by pre_process_transpose and post_process_transpose.

  1. "A conversion error occurs."
  2. "Output results are wrong."

Please don't post such low-level questions as issues.
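
The perm lists given to pre_process_transpose_perm and post_process_transpose_perm follow the usual transpose convention, in which output axis i takes input axis perm[i]. The following is a minimal sketch of that convention on plain shape lists. The helper names are made up for illustration, and the shapes are example values only.

```python
from typing import List


def apply_perm(shape: List[int], perm: List[int]) -> List[int]:
    """Shape after a transpose: output axis i takes input axis perm[i]."""
    if sorted(perm) != list(range(len(shape))):
        raise ValueError(f"perm {perm} is not a permutation of {len(shape)} axes")
    return [shape[axis] for axis in perm]


def invert_perm(perm: List[int]) -> List[int]:
    """Return the perm that undoes `perm`."""
    inverse = [0] * len(perm)
    for out_axis, in_axis in enumerate(perm):
        inverse[in_axis] = out_axis
    return inverse


nchw = [1, 3, 224, 224]                 # example NCHW shape
nhwc = apply_perm(nchw, [0, 2, 3, 1])   # NCHW -> NHWC
print(nhwc)                             # [1, 224, 224, 3]
print(invert_perm([0, 2, 3, 1]))        # [0, 3, 1, 2]
print(apply_perm(nhwc, [0, 3, 1, 2]) == nchw)  # True
```

This is why [0,2,3,1] and [0,3,1,2] tend to appear as a pair: each one undoes the other on a 4D tensor.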

  • convert option

    --param_replacement_file param_replacement.json
    
  • param_replacement.json

    {
      "format_version": 1,
      "operations": [
        {
          "op_name": "StatefulPartitionedCall/Tile_4",
          "param_target": "inputs", # attributes or inputs
          "param_name": "const_fold_opt__677",
          "values": [1,1,17] # Disable parameter transposition or overwrite parameters
        },
        {
          "op_name": "StatefulPartitionedCall/Cast_3",
          "param_target": "attributes", # attributes or inputs
          "param_name": "to",
          "values": 1 # Disable parameter transposition or overwrite "to" parameters
        },
        {
          "op_name": "Resize__697",
          "param_target": "inputs",
          "param_name": "Concat__696:0",
          "values": [26,26] # Replacement of unk__x (Resize OP, sizes height/width parameter)
        },
        {
          "op_name": "Transpose__927",
          "param_target": "attributes",
          "param_name": "perm",
          "values": [0,1,2,3] # Disable parameter transposition or overwrite "perm" parameters
        },
        {
          "op_name": "StatefulPartitionedCall/functional_1/max_unpooling2d_2/Reshape_1",
          "param_target": "inputs",
          "param_name": "const_fold_opt__911",
          "values": [4,131072] # Overwrite "shape" parameters
        },
        {
          "op_name": "Reshape_25",
          "param_target": "outputs",
          "param_name": "onnx::InstanceNormalization_270",
          "post_process_transpose_perm": [0,2,1] # Extrapolate 3D Transpose after Reshape
        },
        {
          "op_name": "Reshape_30",
          "param_target": "outputs",
          "param_name": "onnx::Mul_275",
          "post_process_transpose_perm": [0,2,3,1] # Extrapolate 4D Transpose after Reshape
        },
        {
          "op_name": "flatten_1127",
          "param_target": "inputs",
          "param_name": "dropout0",
          "pre_process_transpose_perm": [0,3,1,2]
        },
        {
          "op_name": "/Slice",
          "param_target": "op",
          "begin": [0,0,1,0],
          "end": [0,0,0,0],
          "end_mask": 15
        },
        {
          "op_name": "/Slice_1",
          "param_target": "op",
          "begin": [0,0,0,0],
          "end": [0,0,39,0],
          "end_mask": 11
        }
      ]
    }
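
Strict JSON has no comment syntax, so the `#` annotations in the sample above would be rejected by a standard parser such as Python's json module and presumably need to be dropped from a real file. A minimal sketch that writes a comment-free replacement file from Python follows. It reuses two entries from the sample, and the output location is a placeholder.

```python
import json
import os
import tempfile

replacement = {
    "format_version": 1,
    "operations": [
        {
            # Overwrite the "perm" attribute of a Transpose OP.
            "op_name": "Transpose__927",
            "param_target": "attributes",
            "param_name": "perm",
            "values": [0, 1, 2, 3],
        },
        {
            # Insert a 4D Transpose after a Reshape OP.
            "op_name": "Reshape_30",
            "param_target": "outputs",
            "param_name": "onnx::Mul_275",
            "post_process_transpose_perm": [0, 2, 3, 1],
        },
    ],
}

# Placeholder location; in practice this would be the path given to
# --param_replacement_file.
json_path = os.path.join(tempfile.mkdtemp(), "param_replacement.json")
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(replacement, f, indent=2)

# Reading the file back confirms that it is valid JSON.
with open(json_path, encoding="utf-8") as f:
    loaded = json.load(f)
print(len(loaded["operations"]))
```

Keeping the explanations as Python comments rather than inside the JSON text lets the notes survive in the script while the generated file stays parseable.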
  • Replacement Supported OPs

    No. OP type Remarks
    1 Add 1. "param_target": "inputs"
    pre_process_transpose_perm: Transpose is applied to the tensor before the Add operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Add operation with the perm specified as post-processing.
    2 Cast
    Type     Values   Type   Values
    float16  10       int8   3
    float32  1        int16  5
    float64  11       int32  6
    bool     9        int64  7
    uint8    2
    uint16   4
    uint32   12
    uint64   13
    3 Concat 1. "param_target": "attributes"
    axis: Value of axis
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Concat operation with the perm specified as post-processing.
    4 ConvTranspose ConvTranspose implements a special replacement of its own: all automatic conversions are ignored, and tf.nn.conv1d_transpose, tf.nn.conv2d_transpose, or tf.nn.conv3d_transpose is generated directly from parameters that are all specified explicitly.
    https://www.tensorflow.org/api_docs/python/tf/nn/conv1d_transpose
    https://www.tensorflow.org/api_docs/python/tf/nn/conv2d_transpose
    https://www.tensorflow.org/api_docs/python/tf/nn/conv3d_transpose
    1. "param_target": "op"
    output_shape: Value of output_shape
    strides: Value of strides
    padding: Value of padding
    dilations: Value of dilations
    5 Div 1. "param_target": "inputs"
    values: Value of input
    pre_process_transpose_perm: Transpose is applied to the tensor before the Div operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Div operation with the perm specified as post-processing.
    6 Expand 1. "param_target": "inputs"
    values: Value of shape
    pre_process_transpose_perm: Transpose is applied to the tensor before the Expand operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Expand operation with the perm specified as post-processing.
    7 Flatten 1. "param_target": "attributes"
    axis: Value of axis
    2. "param_target": "inputs"
    pre_process_transpose_perm: Transpose is applied to the tensor before the Flatten operation with the perm specified as pre-processing.
    3. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Flatten operation with the perm specified as post-processing.
    8 Gemm
    9 Gather 1. "param_target": "inputs"
    values: Value of indices
    pre_process_transpose_perm: Transpose is applied to the tensor before the Gather operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Gather operation with the perm specified as post-processing.
    10 MatMul 1. "param_target": "inputs"
    pre_process_transpose_perm: Transpose is applied to the tensor before the MatMul operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the MatMul operation with the perm specified as post-processing.
    11 Mul 1. "param_target": "inputs"
    values: Value of input
    pre_process_transpose_perm: Transpose is applied to the tensor before the Mul operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Mul operation with the perm specified as post-processing.
    12 NonMaxSuppression
    13 ReduceL1
    ReduceL2
    ReduceLogSum
    ReduceLogSumExp
    ReduceMax
    ReduceMean
    ReduceMin
    ReduceProd
    ReduceSum
    ReduceSumSquare
    1. "param_target": "attributes"
    axes: Value of axes
    keepdims: Value of keepdims
    2. "param_target": "inputs"
    pre_process_transpose_perm: Transpose is applied to the tensor before the ReduceXX operation with the perm specified as pre-processing.
    3. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the ReduceXX operation with the perm specified as post-processing.
    14 Unsqueeze 1. "param_target": "inputs"
    pre_process_transpose_perm: Transpose is applied to the tensor before the Unsqueeze operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Unsqueeze operation with the perm specified as post-processing.
    3. "param_target": "op"
    new_shape: Specifies directly the shape after Unsqueeze processing.
    15 Reshape 1. "param_target": "inputs"
    values: Value of shape
    pre_process_transpose_perm: Transpose is applied to the tensor before the Reshape operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Reshape operation with the perm specified as post-processing.
    16 Resize 1. "param_target": "attributes"
    coordinate_transformation_mode: Value of coordinate_transformation_mode
    extrapolation_value: Value of extrapolation_value
    mode: Value of mode
    2. "param_target": "inputs"
    values: Value of roi or scales or sizes. scales=[scale_h,scale_w],sizes=[h,w]
    pre_process_transpose_perm: Transpose is applied to the tensor before the Resize operation with the perm specified as pre-processing.
    3. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Resize operation with the perm specified as post-processing.
    17 Slice Slice implements a special replacement of its own: all automatic conversions are ignored, and tf.strided_slice is generated directly from tf.strided_slice parameters that are all specified explicitly.
    https://www.tensorflow.org/api_docs/python/tf/strided_slice
    See replace_slice.json for a sample description.
    1. "param_target": "op"
    begin: Value of begin
    end: Value of end
    strides: Value of strides
    begin_mask: Value of begin_mask
    end_mask: Value of end_mask
    ellipsis_mask: Value of ellipsis_mask
    new_axis_mask: Value of new_axis_mask
    shrink_axis_mask: Value of shrink_axis_mask
    18 Softmax 1. "param_target": "attributes"
    axis: Value of axis. The transpositions corresponding to the specified axis are extrapolated before and after Softmax.
    2. "param_target": "inputs"
    values: Value of tensor
    19 Split 1. "param_target": "inputs"
    values: Value of split
    2. "param_target": "attributes"
    axis: Value of axis.
    num_outputs: Value of num_outputs.
    20 Sub 1. "param_target": "inputs"
    values: Value of input
    pre_process_transpose_perm: Transpose is applied to the tensor before the Sub operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Sub operation with the perm specified as post-processing.
    21 Tile 1. "param_target": "inputs"
    values: Value of input
    pre_process_transpose_perm: Transpose is applied to the tensor before the Tile operation with the perm specified as pre-processing.
    2. "param_target": "outputs"
    post_process_transpose_perm: Transpose is applied to the tensor after the Tile operation with the perm specified as post-processing.
    22 Transpose 1. "param_target": "attributes"
    perm: Value of perm
    2. "param_target": "inputs"
    values: Value of tensor
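
The Slice entries in the sample param_replacement.json use end_mask values of 15 and 11. In tf.strided_slice, setting bit i of end_mask means that end[i] is ignored and the slice runs to the end of axis i. The sketch below, whose helper names are purely illustrative, converts between a list of such axes and the integer mask.

```python
from typing import Iterable, List


def mask_from_axes(axes: Iterable[int]) -> int:
    """Set bit i for every axis i whose `end` value should be ignored."""
    mask = 0
    for axis in axes:
        mask |= 1 << axis
    return mask


def axes_from_mask(mask: int, rank: int) -> List[int]:
    """List the axes whose bit is set in the mask."""
    return [axis for axis in range(rank) if mask & (1 << axis)]


print(mask_from_axes([0, 1, 2, 3]))  # 15
print(mask_from_axes([0, 1, 3]))     # 11
print(axes_from_mask(11, rank=4))    # [0, 1, 3]
```

With end_mask 11 on a 4D tensor, only axis 2 honours its end value (39 in the sample), while the other axes run to their full extent.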

Supported layers

  • https://github.com/onnx/onnx/blob/main/docs/Operators.md
  • ✔️: Supported, Help wanted: Pull requests are welcome
    OP Status
    Abs ✔️
    Acosh ✔️
    Acos ✔️
    Add ✔️
    And ✔️
    ArgMax ✔️
    ArgMin ✔️
    Asinh ✔️
    Asin ✔️
    Atanh ✔️
    Atan ✔️
    AveragePool ✔️
    BatchNormalization ✔️
    Bernoulli ✔️
    BitShift ✔️
    BitwiseAnd Help wanted
    BitwiseNot Help wanted
    BitwiseOr Help wanted
    BitwiseXor Help wanted
    Cast ✔️
    Ceil ✔️
    Celu ✔️
    CenterCropPad Help wanted
    Clip ✔️
    Col2Im Help wanted
    Compress ✔️
    ConcatFromSequence ✔️
    Concat ✔️
    ConstantOfShape ✔️
    Constant ✔️
    Conv ✔️
    ConvTranspose ✔️
    Cosh ✔️
    Cos ✔️
    CumSum ✔️
    DepthToSpace ✔️
    Det ✔️
    DequantizeLinear ✔️
    DFT Help wanted
    Div ✔️
    Dropout ✔️
    DynamicQuantizeLinear ✔️
    Einsum ✔️
    Elu ✔️
    Equal ✔️
    Erf ✔️
    Expand ✔️
    Exp ✔️
    EyeLike ✔️
    Flatten ✔️
    Floor ✔️
    FusedConv ✔️
    GatherElements ✔️
    GatherND ✔️
    Gather ✔️
    Gemm ✔️
    GlobalAveragePool ✔️
    GlobalLpPool ✔️
    GlobalMaxPool ✔️
    GreaterOrEqual ✔️
    Greater ✔️
    GridSample ✔️
    GroupNormalization Help wanted
    GRU Help wanted
    Hardmax ✔️
    HardSigmoid ✔️
    HardSwish ✔️
    Identity ✔️
    If ✔️
    Input ✔️
    InstanceNormalization ✔️
    Inverse ✔️
    IsInf ✔️
    IsNaN ✔️
    LayerNormalization ✔️
    LeakyRelu ✔️
    LessOrEqual ✔️
    Less ✔️
    Log ✔️
    LogSoftmax ✔️
    Loop Help wanted
    LpNormalization ✔️
    LRN ✔️
    LSTM Help wanted
    MatMul ✔️
    MatMulInteger ✔️
    MaxPool ✔️
    Max ✔️
    MaxRoiPool Help wanted
    MaxUnpool ✔️
    Mean ✔️
    MeanVarianceNormalization ✔️
    MelWeightMatrix Help wanted
    Min ✔️
    Mish ✔️
    Mod ✔️
    Mul ✔️
    Multinomial ✔️
    Neg ✔️
    NonMaxSuppression ✔️
    NonZero ✔️
    Optional Help wanted
    OptionalGetElement Help wanted
    OptionalHasElement Help wanted
    Not ✔️
    OneHot ✔️
    Or ✔️
    Pad ✔️
    Pow ✔️
    PRelu ✔️
    QLinearAdd ✔️
    QLinearConcat ✔️
    QLinearConv ✔️
    QLinearLeakyRelu ✔️
    QLinearMatMul ✔️
    QLinearMul ✔️
    QLinearSigmoid ✔️
    QLinearSoftmax ✔️
    QuantizeLinear ✔️
    RandomNormalLike ✔️
    RandomNormal ✔️
    RandomUniformLike ✔️
    RandomUniform ✔️
    Range ✔️
    Reciprocal ✔️
    ReduceL1 ✔️
    ReduceL2 ✔️
    ReduceLogSum ✔️
    ReduceLogSumExp ✔️
    ReduceMax ✔️
    ReduceMean ✔️
    ReduceMin ✔️
    ReduceProd ✔️
    ReduceSum ✔️
    ReduceSumSquare ✔️
    Relu ✔️
    Reshape ✔️
    Resize ✔️
    ReverseSequence ✔️
    RNN Help wanted
    RoiAlign ✔️
    Round ✔️
    Scatter ✔️
    ScatterElements ✔️
    ScatterND ✔️
    Scan Help wanted
    Selu ✔️
    SequenceAt ✔️
    SequenceConstruct ✔️
    SequenceEmpty ✔️
    SequenceErase ✔️
    SequenceInsert ✔️
    SequenceLength ✔️
    Shape ✔️
    Shrink ✔️
    Sigmoid ✔️
    Sign ✔️
    Sinh ✔️
    Sin ✔️
    Size ✔️
    Slice ✔️
    Softmax ✔️
    Softplus ✔️
    Softsign ✔️
    SpaceToDepth ✔️
    Split ✔️
    SplitToSequence ✔️
    Sqrt ✔️
    Squeeze ✔️
    STFT Help wanted
    StringNormalizer Help wanted
    Sub ✔️
    Sum ✔️
    Tanh ✔️
    Tan ✔️
    TfIdfVectorizer Help wanted
    ThresholdedRelu ✔️
    Tile ✔️
    TopK ✔️
    Transpose ✔️
    Trilu ✔️
    Unique ✔️
    Unsqueeze ✔️
    Upsample ✔️
    Where ✔️
    Xor ✔️

Generated Model

Validated models (without replacement.json)

ONNX files used for testing: https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.28

No. Model Pass
1 age_googlenet.onnx ✔️
2 alike_t_opset11_192x320.onnx ✔️
3 arcfaceresnet100-8.onnx ✔️
4 baseline_simplified.onnx ✔️
5 bvlcalexnet-12.onnx ✔️
6 caffenet-12.onnx ✔️
7 convtranspose_3_1_5_2.onnx ✔️
8 convtranspose_4_5_2_2.onnx ✔️
9 convtranspose_5_5_6_1.onnx ✔️
10 convtranspose_6_5_5_8.onnx ✔️
11 convtranspose_7_1_3_4.onnx ✔️
12 damoyolo_tinynasL20_T_192x192_post.onnx ✔️
13 densenet-12.onnx ✔️
14 depth_to_spase_17.onnx ✔️
15 digits.onnx ✔️
16 detr_demo.onnx ✔️
17 efficientformer_l1.onnx ✔️
18 efficientnet-lite4-11_nchw.onnx ✔️
19 effnet_opset11_dynamic_axis.onnx ✔️
20 emotion-ferplus-8_rename.onnx ✔️
21 face_detection_yunet_2022mar.onnx ✔️
22 face_recognition_sface_2021dec-act_int8-wt_int8-quantized.onnx ✔️
23 face_recognition_sface_2021dec.onnx ✔️
24 faster_rcnn-10.onnx ✔️
25 fastestdet.onnx ✔️
26 fused_conv_clip.onnx ✔️
27 fused_conv_hardsigmoid.onnx ✔️
28 fused_conv_leakyrelu.onnx ✔️
29 fused_conv_relu.onnx ✔️
30 fused_conv_sigmoid.onnx ✔️
31 fused_conv_tanh.onnx ✔️
32 gender_googlenet.onnx ✔️
33 handpose_estimation_mediapipe_2022may.onnx ✔️
34 iat_llie_180x320.onnx ✔️
35 if_p1_11.onnx ✔️
36 if_p2_11.onnx ✔️
37 if_p3_11.onnx ✔️
38 imageclassifier.onnx ✔️
39 inception-v2-9.onnx ✔️
40 inverse11.onnx ✔️
41 mnist-12.onnx ✔️
42 mobilenetv2-12.onnx ✔️
43 mosaic_11.onnx ✔️
44 mosaic-9.onnx ✔️
45 movenet_multipose_lightning_192x256_p6.onnx ✔️
46 nanodet-plus-m_416.onnx ✔️
47 object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx ✔️
48 object_tracking_dasiamrpn_kernel_r1_2021nov.onnx ✔️
49 object_tracking_dasiamrpn_model_2021nov.onnx ✔️
50 pidnet_S_cityscapes_192x320.onnx ✔️
51 ppmattingv2_stdc1_human_480x640.onnx ✔️
52 qlinear_conv_tensor_test.onnx ✔️
53 rcnn-ilsvrc13-9.onnx ✔️
54 regnet_x_400mf.onnx ✔️
55 ResNet101-DUC-12.onnx ✔️
56 resnet18-v1-7.onnx ✔️
57 resnet50-v1-12.onnx ✔️
58 resnet50-v2-7.onnx ✔️
59 retinanet-9.onnx ✔️
60 sinet_320_op.onnx ✔️
61 squeezenet1.0-12.onnx ✔️
62 super-resolution-10.onnx ✔️
63 swinir-m_64x64_12.onnx ✔️
64 tinyyolov2-8.onnx ✔️
65 version-RFB-640.onnx ✔️
66 vit-b-32_textual.onnx ✔️
67 vit-b-32_visual.onnx ✔️
68 yolact_edge_mobilenetv2_550x550.onnx ✔️
69 yolact_regnetx_600mf_d2s_31classes_512x512.onnx ✔️
70 yolact_regnetx_800mf_20classes_512x512.onnx ✔️
71 yolo_free_nano_crowdhuman_192x320_post.onnx ✔️
72 yolov7_tiny_head_0.768_post_480x640.onnx ✔️
73 yolov8n.onnx ✔️
74 yolov8n-seg.onnx ✔️
75 yolox_nano_192x192.onnx ✔️
76 yolox_nano_416x416.onnx ✔️
77 yolox_s.onnx ✔️
78 yolox_x_crowdhuman_mot17_bytetrack.onnx ✔️
79 zero_dce_640_dele.onnx ✔️
80 zfnet512-12.onnx ✔️
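
A table like the one above could be reproduced locally by running the converter over a folder of test models. The sketch below is an assumption-laden illustration, not the project's actual test harness: it assumes the `onnx2tf` command is on `PATH`, that the `-i` (input ONNX file) and `-o` (output folder) options are used, and that a zero exit code is a reasonable stand-in for the "Pass" column. The helper names `build_command` and `convert_all` and the folder names `models` and `saved_models` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(model_path, output_root):
    """Build an onnx2tf command line for one model.

    Assumption: -i is the input ONNX file and -o is the output folder.
    Each model gets its own output folder named after the file stem.
    """
    model_path = Path(model_path)
    out_dir = Path(output_root) / model_path.stem
    return ["onnx2tf", "-i", str(model_path), "-o", str(out_dir)]


def convert_all(model_dir, output_root):
    """Run the converter on every *.onnx file and record pass/fail.

    Assumption: exit code 0 is treated as a pass; this does not verify
    the numerical accuracy of the converted model.
    """
    results = {}
    for model_path in sorted(Path(model_dir).glob("*.onnx")):
        proc = subprocess.run(build_command(model_path, output_root))
        results[model_path.name] = proc.returncode == 0
    return results


if __name__ == "__main__":
    if shutil.which("onnx2tf") is None:
        print("onnx2tf is not on PATH; nothing to run")
    else:
        # "models" and "saved_models" are placeholder folder names.
        for name, ok in convert_all("models", "saved_models").items():
            print(name, "pass" if ok else "fail")
```

Writing each model to its own output folder keeps one failed conversion from overwriting or hiding the results of another.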

Related tools

  1. tflite2tensorflow
  2. openvino2tensorflow
  3. tflite2json2tflite
  4. tensorflowjs_converter
  5. coremltools
  6. simple-onnx-processing-tools
  7. onnx-simplifier
  8. onnx_graphsurgeon
  9. onnx
  10. onnx-tensorflow
  11. onnx2keras

Acknowledgement

  1. onnx2tflite
  2. onnx-tensorflow
  3. https://github.com/onnx/models
  4. https://github.com/opencv/opencv_zoo

Contributors

Made with contrib.rocks.