precision-sustainable-ag/PhenoCV-WeedCam

Blob File Generation for the 4-Class Model


For the 4-class model, I generated the .xml and .bin files successfully using the Model Optimizer in the OpenVINO toolkit. However, while generating the blob file, I am getting the following error:

Can't satisfy Data node location requirements for Stage node Add1_23332/Fused_Add_

The general workflow that I am following is:

  1. Download the TensorFlow frozen model (.pb format).
  2. Convert it to an intermediate representation (IR) using the OpenVINO toolkit's Model Optimizer. This generates two files: .xml and .bin.
  3. Use those files to generate a .blob file for use with the DepthAI demo models that run on the platform. This step also needs two more JSON files with output specifications based on the model backbone used (can be ResNet, can be MobileNet). A sketch of steps 2-3 follows below.
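For reference, a minimal sketch of how steps 2 and 3 can be scripted. The paths, input shape, and shave count are placeholders, and the blob step uses Luxonis' blobconverter package, which is one way to drive the compile rather than necessarily the exact tool used here:

```python
# Minimal sketch of steps 2-3, assuming a 2020-era OpenVINO install;
# paths, input shape, and shave count are placeholders to adjust.
import subprocess

import blobconverter  # pip install blobconverter (Luxonis' converter)

# Step 2: frozen TensorFlow graph -> OpenVINO IR (.xml + .bin).
# FP16 is required for the Myriad X VPU.
subprocess.run([
    "python", "/opt/intel/openvino/deployment_tools/model_optimizer/mo.py",
    "--input_model", "frozen_inference_graph.pb",
    "--input_shape", "[1,256,256,3]",
    "--data_type", "FP16",
    "--output_dir", "ir/",
], check=True)

# Step 3: IR -> .blob for DepthAI. blobconverter compiles in Luxonis'
# cloud; myriad_compile from the OpenVINO install works locally too.
blob_path = blobconverter.from_openvino(
    xml="ir/frozen_inference_graph.xml",
    bin="ir/frozen_inference_graph.bin",
    data_type="FP16",
    shaves=6,
)
print("blob written to", blob_path)
```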

Testing this on Ubuntu 19.04 and the FFC cameras.

Semantic segmentation has not been tested on the camera yet. The OAK-D should initially be tested as an NCS2, running the models through the OpenVINO platform; a sanity-check sketch follows below.
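A hedged sketch of what that NCS2 sanity check could look like with the 2020-era OpenVINO Python API (file names are placeholders):

```python
# Sketch: load the IR on an NCS2 ("MYRIAD" device) and run one frame.
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
input_name = next(iter(net.input_info))  # older releases used net.inputs
exec_net = ie.load_network(network=net, device_name="MYRIAD")

# The IR expects NCHW input; resize and transpose the test frame to match.
n, c, h, w = net.input_info[input_name].input_data.shape
frame = cv2.imread("test.jpg")
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)[np.newaxis, ...]

result = exec_net.infer(inputs={input_name: blob})
print({name: out.shape for name, out in result.items()})
```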

Brandon (Luxonis): Regarding DeepLabv3+, I've only seen online mentions of issues with it on the Myriad X. So I'm not expecting it to work easily; I think it needs a custom layer implementation through OpenCL.
The way that Intel supports the Movidius Myriad X through OpenVINO by default is the Intel NCS2:
https://www.mouser.com/new/intel/intel-neural-compute-stick-2/
https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_linux.html
https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_raspbian.html
https://github.com/PINTO0309/PINTO_model_zoo

We will test this solution this week.

I don't have any experience with the NCS2, but regarding your point 3: do you think it's caused by the use of the Xception-65 backbone network in the uploaded DeepLabv3+ model? If so, you can try to convert a MobileNet-based one from here:
https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md
It's trained on PASCAL VOC, but that wouldn't matter in terms of getting the model up and running.

Does the 4-class model have an Xception backbone? Also, I did try using one of the models trained on PASCAL, which I think had a MobileNet backbone, but it did not work out. It generated the blob successfully, but when we used it with the DepthAI test script (it only works with depth disabled), it did not show any segmentation maps.

Yes, both the 4-class and 3-class models are based on the Xception-65 backbone described here: https://arxiv.org/abs/1802.02611

Excuse me if I'm being blunt :) But based on my personal experience it is often (1) a missing mean-pixel subtraction before inference, if that step is not included in the model (or, conversely, a double mean subtraction); (2) the output mask being assigned class values starting from zero, so it appears just "black"; or (3) a reversed order of the RGB channels in the input image. A quick check for all three is sketched below.
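A rough sketch of those checks, assuming DeepLab-style preprocessing that maps pixels to [-1, 1] (verify against your training setup); run_inference is a hypothetical stand-in for the actual inference call:

```python
import cv2
import numpy as np

frame_bgr = cv2.imread("test.jpg")

# (3) OpenCV loads BGR; most TensorFlow models expect RGB.
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB).astype(np.float32)

# (1) Normalize only if the exported graph does NOT already do it
# (doing it twice is as bad as skipping it).
frame_pre = frame_rgb / 127.5 - 1.0

# (2) Class indices 0..3 all look black in an 8-bit viewer;
# stretch them so each class becomes visible.
mask = run_inference(frame_pre)  # hypothetical inference call
vis = mask.astype(np.uint8) * (255 // max(int(mask.max()), 1))
cv2.imwrite("mask_vis.png", vis)
```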

If the PASCAL VOC MobileNet-based DeepLab model works, I'm sure we can train that one on the same data. That should speed up inference significantly as well.

Hi, so I tried working with the updated models that were pushed this week, and the blob file is now getting generated for both models. I have a few doubts that I wanted to clear up.

  1. I tried to visualize the graphs using TensorBoard, and for the large model it showed multiple (5) "import" subsystems in the graph, one of which has an Xception65 block in it while the others are technically the same architecture. Am I missing something while visualizing it in TensorBoard, or does it have something to do with the .pb file, i.e., the models being imported into it and the frozen graph being exported again? (A visualization sketch follows below.)
  2. I assume that the input size for the .pb files used was 2049x2049 px; if I am not wrong, I used that size to convert it to the .blob file needed by DepthAI. I just wanted to know if it is possible to get a smaller model for inference without losing any points on how the model performs. As mentioned earlier, the deeplabv3_mnv3_decoder256 (256x256) model works at around 11-15 fps on the system right now, so the larger model will be much slower.
  3. After trying to run inference with the .blob file generated from the small model, I ran into a segmentation fault (core dump). I am working on this issue right now.

Any pointers related to any of these issues will be appreciated :)
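On the off chance it helps with point 1, here is a minimal sketch of how I'd dump a frozen .pb for TensorBoard (TF 1.x-style API; file names are placeholders). One plausible cause of the repeated subsystems, assuming the graph was re-imported several times before freezing, is that each tf.import_graph_def call without an explicit name adds an import/, import_1/, ... prefix:

```python
# Dump a frozen .pb for TensorBoard inspection.
import tensorflow as tf

with tf.io.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    # An empty name avoids the import/ prefix; repeated imports into one
    # graph would otherwise show up as import/, import_1/, ... subsystems.
    tf.import_graph_def(graph_def, name="")

writer = tf.compat.v1.summary.FileWriter("tb_logs", graph)
writer.close()
# Then: tensorboard --logdir tb_logs
```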

@skovsens This is the issue I told you about, thanks for your help.

Hi, thank you for digging into it!
Regarding 1): that's very strange. I have pushed the non-frozen versions of the two mobilenet_v3 models; hopefully that can shed some light on it. As far as I can read, there shouldn't be any signs of Xception65 in the trained model.

  2. I hoped that the input size limit didn't propagate through the conversion, but it seems I was wrong. You are right that it was exported with a maximum input size of 2049x2049. I have exported and pushed the same model with 256x256 as the maximum input size (a re-export sketch follows below).
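For reference, a hedged sketch of the re-export, assuming DeepLab's export_model.py from tensorflow/models/research/deeplab; the flag names and the model variant string are from memory, so verify them against the script, and the paths are placeholders:

```python
import subprocess

subprocess.run([
    "python", "deeplab/export_model.py",
    "--checkpoint_path", "train_logs/model.ckpt",
    "--export_path", "export_256/frozen_inference_graph.pb",
    "--model_variant", "mobilenet_v3_small_seg",  # assumed variant name
    "--num_classes", "4",
    "--crop_size", "256",  # height
    "--crop_size", "256",  # width (crop_size is a repeated flag)
], check=True)
```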

Depending on the field of view we get from the camera, I wonder how much we can see in a 256x256 pixel image - I look forward to finding out 😃

Cool that the pipeline is running smoothly now! 😃
Models with a 513x513-pixel input image are now uploaded. Depending on the image you test on, I would recommend cropping images, as opposed to scaling, to test the model performance (or a combination of the two; see the sketch below). We can always train the model on lower-resolution images to match the expected final sampling distance (in terms of how large an area in the real world each pixel represents in the image).