luxonis/depthai-ml-training

Converting a SavedModel Tensorflow Format to Luxonis Blob format

Closed this issue · 2 comments

This is a custom model trained on edgeimpulse.com, which provides both a TensorFlow SavedModel and a TensorFlow Lite model at the end of training. The issue is converting the TensorFlow SavedModel to a Luxonis blob by first freezing the SavedModel and then using https://blobconverter.luxonis.com/.

These are my files:
SavedModel file:
SavedModel.zip

Tflite file:
Tflite_file.zip

Sorry for opening this issue since it has probably come up several times, but none of the existing solutions seemed to resolve it. I've attached the files above in case the conversion works for you. Sometimes the conversion does succeed, but the resulting blob doesn't work when running on the OAK-D. For reference, the model has 5 labels and is built on the MobileNet architecture.

  1. Freezing the TensorFlow SavedModel. The method below successfully freezes the SavedModel; however, https://blobconverter.luxonis.com/ then fails to convert the frozen graph to a blob.

Code to freeze the SavedModel:

import tensorflow as tf

# Load the saved model
loaded = tf.saved_model.load("/content/saved_model/")

# Extract the graph
graph = tf.function(loaded.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]).get_concrete_function(tf.TensorSpec(shape=[1, 320, 320, 3], dtype=tf.float32))
frozen_graph = graph.graph.as_graph_def()

# Save the frozen graph
with tf.io.gfile.GFile("/content/frozen_model.pb", "wb") as f:
    f.write(frozen_graph.SerializeToString())

Error I get with blobconverter:

[ ERROR ]  Cannot infer shapes or values for node "StatefulPartitionedCall".
[ ERROR ]  Expected DataType for argument 'dtype' not None.
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function tf_native_tf_node_infer at 0x7f99c67e1af0>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Exception occurred during running replacer "REPLACEMENT_ID" (<class 'openvino.tools.mo.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "StatefulPartitionedCall" node. 
 For more information please refer to Model Optimizer FAQ, question #38. (https://docs.openvino.ai/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=38#question-38)

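One thing worth noting: the snippet above extracts a GraphDef but never folds the model's variables into constants, which is what a truly frozen graph requires and which may be related to the StatefulPartitionedCall error. A minimal sketch of a more complete freeze, assuming the same paths and serving signature as above (untested against this particular model):

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

# Load the saved model and grab the serving signature
loaded = tf.saved_model.load("/content/saved_model/")
concrete_func = loaded.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]

# Fold the captured variables into the graph as constants
frozen_func = convert_variables_to_constants_v2(concrete_func)
frozen_graph_def = frozen_func.graph.as_graph_def()

# Save the frozen graph
tf.io.write_graph(graph_or_graph_def=frozen_graph_def,
                  logdir="/content",
                  name="frozen_model.pb",
                  as_text=False)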

  2. Using a SavedModel -> IR representation (OpenVINO) -> blob conversion
    I used the standard OpenVINO instructions on the site to convert the SavedModel into a .xml and a .bin file.

This converted the model successfully to .xml and .bin files; however, the .xml file was just 2 KB and the .bin file was 0 KB.
Output during conversion:

[ INFO ] The model was converted to IR v11, the latest model format that corresponds to the source DL framework input/output format. While IR v11 is backwards compatible with OpenVINO Inference Engine API v1.0, please use API v2.0 (as of 2022.1) to take advantage of the latest improvements in IR v11.
Find more information about API v2.0 and IR v11 at https://docs.openvino.ai/latest/openvino_2_0_transition_guide.html
[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: /content/saved_model.xml
[ SUCCESS ] BIN file: /content/saved_model.bin

Nevertheless, I tried converting this IR to a blob; the result was a 1 KB file, and when I ran it, the NN input size was reported as 3x320, which made no sense since the model was trained on 320x320 images in [1,320,320,3] format.

I then reran the conversion using:

!mo --input_shape [1,320,320,3] --saved_model_dir /content/saved_model/ --layout "ncwh->nhwc"

However, this time when I converted the IR to a blob and ran it, I got an error stating the bounding boxes contain x=1, y=0, w=0, h=0.
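
As an aside, the IR-to-blob step can also be run locally with the blobconverter Python package instead of the web UI, which makes it easier to iterate. A minimal sketch, assuming the .xml/.bin paths produced by mo above and a default shave count (both are assumptions to adjust for your setup):

import blobconverter

# Compile the IR produced by mo into a .blob for the OAK-D
blob_path = blobconverter.from_openvino(
    xml="/content/saved_model.xml",
    bin="/content/saved_model.bin",
    data_type="FP16",
    shaves=6,
)
print(blob_path)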

  3. This time I tried a .tflite -> .onnx approach using https://github.com/zhenhuaw-me/tflite2onnx, which wasn't successful at the ONNX conversion stage itself. (A .tflite model is also provided on the EdgeImpulse dashboard, so I thought I'd try this route.)

Code:

import tflite2onnx

tflite_path = '/content/trained.tflite'
onnx_path = '/content/model.onnx'

tflite2onnx.convert(tflite_path, onnx_path)

Error:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-37-c193f27d3323> in <module>
      4 onnx_path = '/content/model.onnx'
      5 
----> 6 tflite2onnx.convert(tflite_path, onnx_path)

4 frames
/usr/local/lib/python3.8/dist-packages/tflite2onnx/op/common.py in create(self, index)
    152             if opcode in tflite.BUILTIN_OPCODE2NAME:
    153                 name = tflite.opcode2name(opcode)
--> 154                 raise NotImplementedError("Unsupported TFLite OP: {} {}!".format(opcode, name))
    155             else:
    156                 raise ValueError("Opcode {} is not a TFLite builtin operator!".format(opcode))

NotImplementedError: Unsupported TFLite OP: 83 PACK!

Next, I tried tf2onnx (https://github.com/onnx/tensorflow-onnx) with the SavedModel file.
Input:

python -m tf2onnx.convert --saved-model /content/saved_model/ --output /content/model.onnx

Output:

2023-01-19 17:18:58,166 - WARNING - '--tag' not specified for saved_model. Using --tag serve
2023-01-19 17:19:12,548 - INFO - Signatures found in model: [serving_default].
2023-01-19 17:19:12,548 - WARNING - '--signature_def' not specified, using first signature: serving_default
2023-01-19 17:19:12,550 - INFO - Output names: ['output_0', 'output_1', 'output_2', 'output_3']
2023-01-19 17:19:15,690 - INFO - Using tensorflow=2.9.2, onnx=1.13.0, tf2onnx=1.13.0/2c1db5
2023-01-19 17:19:15,690 - INFO - Using opset <onnx, 13>
2023-01-19 17:19:15,694 - INFO - Computed 0 values for constant folding
2023-01-19 17:19:15,700 - INFO - Optimizing ONNX model
2023-01-19 17:19:15,715 - INFO - After optimization: Const -3 (4->1), Identity -1 (4->3)
2023-01-19 17:19:15,716 - INFO - 
2023-01-19 17:19:15,716 - INFO - Successfully converted TensorFlow model /content/saved_model/ to ONNX
2023-01-19 17:19:15,716 - INFO - Model inputs: ['input']
2023-01-19 17:19:15,716 - INFO - Model outputs: ['output_0', 'output_1', 'output_2', 'output_3']
2023-01-19 17:19:15,716 - INFO - ONNX model is saved at /content/model.onnx

But then, when converting the resulting ONNX model to a blob with blobconverter, I get this error:

[ ERROR ]  Numbers of inputs and mean/scale values do not match. 
 For more information please refer to Model Optimizer FAQ, question #61. (https://docs.openvino.ai/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=61#question-61)
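
From the FAQ, this error means the number of mean/scale value groups passed to Model Optimizer does not match the model's inputs (this ONNX model has a single input named 'input', per the tf2onnx log above). A minimal sketch of scoping the values to that one input via the blobconverter Python package; the 127.5 mean/scale values are only an assumption about the EdgeImpulse preprocessing and would need to match the actual training pipeline:

import blobconverter

# Provide mean/scale values for exactly one input ('input') so the counts match
blob_path = blobconverter.from_onnx(
    model="/content/model.onnx",
    data_type="FP16",
    shaves=6,
    optimizer_params=[
        "--mean_values=input[127.5,127.5,127.5]",
        "--scale_values=input[127.5,127.5,127.5]",
    ],
)
print(blob_path)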

I'm completely stuck and have tried everything I could think of. Any workaround for converting this model to a blob would be really helpful, since I'm completing the project under a tight deadline.

Thanks a lot. Your help is really appreciated!
Dhruv Sheth

Hi @dhruvsheth-ai ,
What numbers did you provide for mean/scale values (when running the Model Optimizer - mo.py)?
Thanks, Erik

Sorry for the trouble @Erol444. I later used one of the tools created by PINTO to successfully convert it. Referencing the solution for others who are interested: PINTO0309/PINTO_model_zoo#323

Thanks!