ai-techsystems/deepC

A complete working example of converting a PyTorch model to .cpp and deploying it on an Arduino

mohaimenz opened this issue · 15 comments

Hi,
I am kind of struggling to find the steps to deploy my .cpp to an Arduino.
I have a practical model that I want to deploy on an Arduino Nano 33 BLE Sense.
I was going through the following page, which finishes with converting the model to C++.
https://cainvas.ai-tech.systems/use-cases/sign-language-sensor-app/

Can anybody provide/direct me to the next steps?

Kind Regards,
Mohaimen

Thank you so much for filing the issue. We will look at it and take appropriate action as soon as possible.

Check out the file in your cainvas workspace ./asl_model_deepC/asl_model.cpp

Hi,
I am looking for the steps required to deploy the .cpp file that I generate using deepC. This part is missing, and I am unable to find the steps. Could you please provide something like the page I referenced in my first comment, where you described the steps up to creating the .cpp file? It would be greatly appreciated.

I'm also looking for information on how to do this. I've followed as much as I could, but can't find enough to get me through to the end. I've used the following links:

I'm going to try going the ONNX->TFLite approach now, but I'd prefer to use your library.

Hi,
One very important piece of information is missing everywhere: how do I convert my post-training quantized model using deepC? I do not see any documentation about it. It seems that deepC depends entirely on ONNX and cannot directly handle a PyTorch model. If that is the case, ONNX currently does not support quantized models, so deepC can't help you deploy your quantized PyTorch model to a microcontroller (Arduino). I still can't believe this. Am I missing something?
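For context, the export step I am referring to is the standard PyTorch-to-ONNX path that deepC consumes. A minimal sketch (TinyNet and the input shape are placeholders, not my actual model):

```python
# Minimal sketch of the standard PyTorch -> ONNX export that deepC consumes.
# TinyNet and the input shape are placeholders, not my actual model.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(x)

model = TinyNet().eval()
dummy_input = torch.randn(1, 16)  # placeholder input shape

# A float model exports fine; exporting a model already quantized with
# PyTorch's qint8 backends is where this path breaks down.
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=11,
)
```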

ONNX supports quantization in the network; check out the quantization/de-quantization operators (a usage sketch follows the list):

  1. https://github.com/onnx/onnx/blob/master/docs/Operators.md#QuantizeLinear
  2. https://github.com/onnx/onnx/blob/master/docs/Operators.md#DynamicQuantizeLinear
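
A minimal sketch of how these operators typically enter a graph, using onnxruntime's post-training dynamic quantization (file names are placeholders):

```python
# Minimal sketch: onnxruntime's post-training dynamic quantization rewrites a
# float ONNX graph using these (Dynamic)QuantizeLinear/DequantizeLinear nodes.
# File names are placeholders.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model.onnx",        # float32 model exported from PyTorch
    model_output="model_int8.onnx",  # weights stored as int8
    weight_type=QuantType.QInt8,
)
```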

The AITS deepC compiler supports quantization as a separate step too; it has not been open-sourced yet.

Also, here is a helpful article on how to quantize a network using PyTorch: https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html#post-training-static-quantization
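
A minimal sketch of that tutorial's post-training static quantization flow (the tiny network and random calibration data are placeholders):

```python
# Minimal sketch of post-training static quantization, following the linked
# tutorial. The tiny network and random calibration data are placeholders.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 boundary
        self.fc = nn.Linear(16, 4)
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 boundary

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)

# Calibration pass with representative inputs (random here, real data in practice).
for _ in range(8):
    model(torch.randn(1, 16))

torch.quantization.convert(model, inplace=True)  # int8 weights and activations
```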

@srohit0 That is quantization by ONNX. So you can't quantize using PyTorch and then export to ONNX. PyTorch post-training quantization gives me a prediction accuracy of 77.5%, whereas when I quantize using ONNX it gives me only 60% accuracy, which is too low. Anyway, it seems there is nothing off-the-shelf for deploying a complex, relatively large model to MCUs.

Please support your assertions with example notebooks to clarify what you're looking for.

Note what you can and cannot export after quantization in PyTorch and ONNX.

@srohit0 I can definitely do that. I am writing my paper and at the same time working on the deployment of my model, and struggling a lot here. The ONNX team has already said in many places that it does not support quantized models; however, you can always quantize using ONNX. Anyway, right now I can't share my code or my model, but I will definitely come back with example work once I submit my paper. I am still working with deepC. I just took an extremely compressed model, from 20MB down to only 266KB, and converted it into ONNX. Now I will attempt to use deepC to deploy it on a Spresense instead of the Arduino Nano 33 BLE Sense, as the latter has only 256KB of SRAM (the Spresense has 1.5MB).

I'm also very much interested in doing this.
I've successfully deployed a custom TensorFlow model (converted to tflite) on a Nano 33 BLE, but I'd like to do the same with a new PyTorch model. So far, I've not found a tflite equivalent for PyTorch or ONNX.

I've got high hopes for deepC and would welcome any reference documentation that anyone can point me to.
@mohaimenz, if you've got some code that is shareable, I might be able to take a stab and share any successes I might have with it.

Would love to help you @dionator and @mohaimenz

You can bring your PyTorch model to Cainvas and use the deepC compiler to get a working model library for the Nano 33 BLE.
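
In outline, the flow is: export to ONNX, then compile with deepC. A sketch assuming deepC's deepCC driver is available on the command line (the exact invocation on Cainvas may differ):

```python
# Sketch of the end-to-end flow, assuming deepC's deepCC driver is installed
# and on the PATH; the exact invocation on Cainvas may differ.
import subprocess
import torch

model = torch.nn.Linear(16, 4).eval()  # placeholder model
torch.onnx.export(model, torch.randn(1, 16), "model.onnx",
                  input_names=["input"], output_names=["output"])

# deepCC compiles the ONNX graph to dependency-free C++ (model.cpp),
# which can then be built into an MCU project.
subprocess.run(["deepCC", "model.onnx"], check=True)
```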

You can reach out for further support in case you need it.

@srohit0 The Arduino Nano 33 BLE has 1MB of Flash and 256KB of RAM. My model size is around 500KB and the input size is around 250KB. After 8-bit quantization, the model size becomes 160KB, which is a non-issue since I have 1MB of Flash. The issue is that the arena size required to run the model with TF Lite Micro on the Nano is 303KB.

The bottom line is that deepC works with ONNX models, and ONNX does not support quantized PyTorch models in general (the Caffe2 path being the exception). I tried ONNX static quantization for my model; however, it suffers a 37% accuracy loss, whereas PyTorch static quantization produces a model with only a 4% accuracy loss.
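
For reference, the ONNX static quantization path I tried looks roughly like this (the calibration reader below feeds random data; I used real samples):

```python
# Rough sketch of the onnxruntime static quantization path I tried.
# The calibration reader below feeds random data; I used real samples.
import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantType,
                                      quantize_static)

class RandomReader(CalibrationDataReader):
    """Feeds a few calibration batches; placeholder for a real dataset."""
    def __init__(self, n=8):
        self.batches = iter(
            [{"input": np.random.randn(1, 16).astype(np.float32)}
             for _ in range(n)]
        )

    def get_next(self):
        return next(self.batches, None)

quantize_static(
    model_input="model.onnx",
    model_output="model_int8.onnx",
    calibration_data_reader=RandomReader(),
    weight_type=QuantType.QInt8,
)
```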

For now, we are using TF Lite, which produces a quantized model with a 15% accuracy loss; however, that was all I could do this time. So I ended up deploying my model on a relatively larger device from Sony.
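
For anyone following the same route, the TF Lite post-training quantization we used looks roughly like this (the saved-model path and representative data are placeholders):

```python
# Rough sketch of the TF Lite post-training integer quantization we used.
# The saved-model path and representative data are placeholders.
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data():
    # A handful of representative inputs; real samples in practice.
    for _ in range(8):
        yield [np.random.randn(1, 16).astype(np.float32)]

converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8   # full-integer model for the MCU
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
```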

We are still keeping an eye on deepC in case it can offer an alternative for our model.
I hope my reply provided clear information. For now, I am unable to share my implementation as I am submitting my work to a conference, but I am happy to discuss further if anyone is interested.

PyTorch supports quantization.

Expecting us to solve your problem without providing complete information is a non-starter. We're rooting for your success nonetheless.

@srohit0 Yes, I mentioned that in the last sentence of the second paragraph.
Well, I am not expecting anybody to solve my problem. I was just trying to gather the information, and once I found that:

  1. deepC does not directly handle PyTorch models; rather, it works with ONNX,
  2. ONNX does not support quantized PyTorch models, and
  3. ONNX quantization hurts my model's accuracy by a huge margin,

I just stopped saying anything. When you mentioned me, I thought I would share that information, and did so.