kraiskil/onnx2c

Floating point exception during conversion to C

kgeeting opened this issue · 4 comments

Cool library. I tried converting one of my onnx models using the tool and was thrown a floating point exception error (zsh: floating point exception) during conversion. Apple clang version 15.0.0 (clang-1500.1.0.2.5). Maybe some unhandled div by 0 somewhere?


fullModel.onnx.zip

Thanks :)

I had a quick look at your model, and indeed, the exception is a division by zero. This stems from the first convolution layer, /isi_encoder/conv1/Conv, which is a 2D convolution but has its stride given as [2]. onnx2c then interprets this as a stride of [2,0], which of course is rather silly.
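To illustrate why a zero stride ends in a crash, here is a minimal sketch of the usual convolution output-size formula along one axis. It is not onnx2c's actual code; the function name and the 64x64 input with a 3x3 kernel are made up for the example. In C or C++ an integer division by zero typically raises SIGFPE, which the shell reports as "floating point exception"; Python raises ZeroDivisionError instead.

```python
def conv_output_size(in_size, kernel, stride, pad_begin=0, pad_end=0, dilation=1):
    """Output length along one spatial axis of a convolution."""
    effective_kernel = (kernel - 1) * dilation + 1
    # The stride is the divisor, so a stride of 0 cannot be evaluated.
    return (in_size + pad_begin + pad_end - effective_kernel) // stride + 1


# Hypothetical sizes: 64x64 input, 3x3 kernel, strides read as [2, 0].
print("first axis:", conv_output_size(64, 3, stride=2))

try:
    conv_output_size(64, 3, stride=0)
except ZeroDivisionError as err:
    print("second axis:", err)
```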

Now the onnx documentation states:

strides: int64[]
Stride along each spatial axis. If not present, the stride defaults is 1 along each spatial axis.

This reads to me like either "pad missing dimensions with ones" or "strides must be of correct dimensions or not given at all".
Whereas somehow a "stride of 2", like in this model, does sound more like 2 in each dimension. Was this your intention?

The onnx docs might mention some rule about splatting attributes somewhere, but I can't find it right now.
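The possible readings discussed here can be sketched as a small helper. This is only an illustration: `normalize_strides` and the policy names "strict", "pad" and "splat" are hypothetical, and nothing here claims to be what the onnx specification or onnx2c actually does.

```python
def normalize_strides(strides, n_spatial, policy="strict"):
    """Return one stride per spatial axis.

    policy (hypothetical names):
      "strict": accept only a missing attribute or exactly n_spatial values
      "pad":    fill missing trailing axes with 1
      "splat":  repeat a single given value on every axis
    """
    if not strides:
        # Attribute not present: the documented default is 1 on each axis.
        return [1] * n_spatial
    strides = list(strides)
    if any(s < 1 for s in strides):
        raise ValueError(f"strides must be positive, got {strides}")
    if len(strides) == n_spatial:
        return strides
    if policy == "pad" and len(strides) < n_spatial:
        return strides + [1] * (n_spatial - len(strides))
    if policy == "splat" and len(strides) == 1:
        return strides * n_spatial
    raise ValueError(
        f"got {len(strides)} stride value(s) for {n_spatial} spatial axes"
    )


# The model in this issue: a 2D convolution with strides given as [2].
print(normalize_strides([2], 2, policy="pad"))
print(normalize_strides([2], 2, policy="splat"))
```

With the default "strict" policy the same input raises a ValueError, which corresponds to reporting the malformed model instead of crashing on it.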

I would provisionally say this is

  • a malformed input onnx model
  • either bad documentation of onnx or a missing, clearly nice to have feature
  • a bug in onnx2c (it should report an error rather than crash)
  • a missing feature in onnx2c (splat or pad the strides)

A quick fix would be to try modifying the model so that the attributes are encoded more explicitly :)
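One way to do that, as a rough sketch: walk the Conv nodes and repeat a single-valued `strides` attribute across all kernel axes. This assumes the intended meaning really was "the same stride on every axis", which may not hold for every model. `expand_conv_strides` is a hypothetical helper, not part of onnx or onnx2c, and it skips nodes without an explicit `kernel_shape` attribute since the number of spatial axes is not visible there.

```python
def expand_conv_strides(model):
    """Splat single-valued Conv strides across all kernel axes, in place.

    Returns the names of the nodes that were changed.
    """
    changed = []
    for node in model.graph.node:
        if node.op_type != "Conv":
            continue
        attrs = {a.name: a for a in node.attribute}
        kernel = attrs.get("kernel_shape")
        strides = attrs.get("strides")
        if kernel is None or strides is None:
            continue
        n_spatial = len(kernel.ints)
        if len(strides.ints) == 1 and n_spatial > 1:
            value = strides.ints[0]
            del strides.ints[:]
            strides.ints.extend([value] * n_spatial)
            changed.append(node.name)
    return changed


# Usage, assuming the onnx Python package is installed
# (the output file name is just an example):
#   import onnx
#   model = onnx.load("fullModel.onnx")
#   print("patched:", expand_conv_strides(model))
#   onnx.save(model, "fullModel_fixed.onnx")
```

Checking the patched nodes by hand afterwards is a good idea, since the script cannot know what the model author intended.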

Btw, just for posterity - which tool (and version) generated this .onnx file?

Sorry for the late reply. I reviewed the onnx model being fed in and your provisional guess was correct: the model was malformed, with stride discrepancies (as noted above) and mismatched dimension sizes in a few tensor concatenations. Oddly, neither of these issues was flagged when first converting the Python model from PyTorch (v2.2) to onnx. But the subsequent attempt to convert to C with your tool flagged them. :)

I've seen promising results when subsequently profiling the C models on an STM Nucleo board, and again just want to say, well done on the tool! I may look at the quantization alpha features you've got next. I do agree that reporting the stride error (in spatialfilter.h) might be helpful in case people run into similar problems in the future. Cheers

Thanks for the followup.

There are a lot of rules in the onnx documentation, most of which onnx2c does not check, simply because that would take a lot of lines of code... But it definitely should add a few checks for this kind of thing, where something as popular as pytorch creates bad input.

Re the quantization - it is really alpha level. It has only ever been used to quantize that one example with the AVR (https://github.com/kraiskil/onnx2c/tree/master/examples/atmega_mnist), and probably still has some parts of that project hard coded in the sources. I was actually thinking of removing that quantization feature completely...

I would strongly recommend trying out the other quantizers out there first. When I wrote that quantization thing I found nothing that worked or had a reasonable learning curve. But the field moves fast, and nowadays there seem to be options.

Added the strides check in the above commit.