ERROR("Non-constant 'constant_value' input to Pad would result in dynamic memory allocation")
henkten opened this issue · 15 comments
Hello, first of all thank you for this great repository.
I have the following error when I try to convert my onnx model to C using onnx2c.
onnx2c/src/nodes/pad.cc, line 45
Unfortunately I am a beginner in C and have no idea how to fix this error.
Here is an overview of the onnx model (exported from pytorch).
Does anyone know a way to fix this error, or can anyone help me?
I will be happy to answer any questions about the model or my setup.
Thank you very much in advance!
With kind regards
Henk
Hi Henk,
I looked at the onnx2c code around the pad operator, and found this comment, which suggests this is a bug in onnx2c.
The constant padding value should be usable even if not constant, but I guess the printed code does not yet know how to handle such a case.
A quick work-around, if applicable in your model, is to not give the constant value input to the Pad node when generating the network. Onnx2c would then use the default of 0 - but this of course might not be what you need?
Hi,
thank you very much for your quick reply.
I have checked the constant_value = inputs[2] for my Pad nodes. The image shows that constant_value is an empty string; I suspect that is the error.
I deleted the constant_value input from all Pad nodes and tested the model a second time with onnx2c.
Now the code runs through this section without error.
Unfortunately, I have a new error for the cast nodes. For this error, I have no idea what to do.
Here is a picture of the cast node. Maybe you have an idea how to fix this problem?
Thank you very much in advance!
With kind regards
Henk
The constant_value can be a string as per the ONNX specification, though before ONNX opset version 13, it had to be an integer. This looks like a case where onnx2c hasn't kept up with the development of the ONNX specification.
For the cast, I think this is a duplicate of issue #26? I'm looking at the above graph, but I can't spot a Cast node there?
Unfortunately, I do not understand from issue #26 what I should change. Do you have a suggestion for how I can work around the error?
Sorry, I split the diagram of the simplified net above. Here is the diagram of the original network.
The problem with the onnx2c Cast implementation is the switch statement starting at line 28 in bdfe0f7.
Casts to integers never got implemented because Cast really doesn't sound like a useful operator when targeting embedded systems - such a node should be optimized away. Maybe a tool like onnxsim could help fold away such nodes? Also, looking at the graph, the Dropout sounds like a pure training node - so one that is not implemented in onnx2c.

With the model created by onnxsim I got this error. Here is a picture of the Pad node. The model does not have Cast and Dropout nodes anymore.
This is curious. Either your model is doing something really strange with the constant_value input tensor, or onnx2c handles this input incorrectly.
From the image I see the constant_value has no name, and mode: constant means constant_value should be a scalar, so I would lean towards an unimplemented feature in onnx2c. Would it be possible to post the simplified model to debug this?
As a side note, I just had a look, and ONNX has very few (1) backend tests for the Pad operator. So onnx2c being non-conformant is pretty likely.
Sure, here is a picture of the simplified onnx model.
For a better understanding of how the onnx model was created: it is an export of a Temporal Convolutional Network from PyTorch. Maybe that is where the missing name for constant_value comes from.
I can upload or send you the model if you want a better overview and understanding of the model parameters.
Yes, please upload. Having the .onnx file allows running onnx2c under the debugger, which helps in understanding the problem.
I have uploaded the original Onnx, a modified Onnx with changed input and output names, and a simplified version of the modified model.
Additionally I uploaded the script I used to modify the original model.
Here is the link to the repo.
Thanks for the files. I ran the tcn_model_mod_sim.onnx one to have a look at what is going on.
The problem is with the 2nd input to the Pad node (at least the /res_blocks.1/Pad_1 one). This pads input is of the format float32[64], which is not according to the ONNX specification (https://github.com/onnx/onnx/blob/main/docs/Operators.md#Pad). The pads input is supposed to tell how much to pad; a tensor of floats sounds more like the contents of the added cells in the padded output tensor.
pads (non-differentiable) : tensor(int64)
Tensor of integers indicating the number of padding elements to add or remove (if negative) at the beginning and end of each axis. For 2D input tensor, it is the number of pixels. pads should be a 1D tensor of shape [2 * num_axes] where num_axes refers to the number of elements in the axes input or the input rank if axes are not provided explicitly. pads format should be: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pad values added at the beginning of axis axes[i] and xi_end, the number of pad values added at the end of axis axes[i].
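To make that layout concrete, here is a small pure-Python sketch of constant-mode padding on a 1-D tensor - my own illustration of the quoted spec, not onnx2c code:

```python
# Pure-Python sketch of ONNX Pad with mode "constant" on a 1-D tensor.
# pads = [x1_begin, x1_end] for the single axis, per the layout quoted above.
def pad_1d(data, pads, constant_value=0.0):
    begin, end = pads  # number of cells to add before and after the data
    return [constant_value] * begin + list(data) + [constant_value] * end

print(pad_1d([1.0, 2.0, 3.0], [2, 1]))
# -> [0.0, 0.0, 1.0, 2.0, 3.0, 0.0]
```

So a correct pads input is a short int64 tensor like [2, 1], while a float32[64] tensor looks like the cell contents instead of the pad amounts.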
Seeing such output sounds like a bug in onnxsim. I tried to run onnxsim on the input tcn_model.onnx, but that would not work, since onnxsim couldn't handle the Reshape node being of too new a version (i.e. it has an 'allowzero' attribute as of ONNX opset 14).
Latest onnx2c version now contains at least a more explicit error message on this kind of input :)
Maybe another approach here is to look back, if padding really is necessary? In such a small network, would it be feasible to ignore the edges of the input? I would guess it would result in faster inference times.
Thank you for your time and investigation for my model.
This is not the final model which I will use, but a simple replica to work with quickly and easily.
Unfortunately, padding is required because the output length must be equal to the input length.
Did I understand correctly that it is not possible to manually adjust the padding operations to avoid the errors before I try to convert the model with onnx2c?
> Did I understand correctly that it is not possible to manually adjust the padding operations to avoid the errors before I try to convert the model with onnx2c?
If by this you mean manually editing the .onnx file: I don't know of such a tool, but I would imagine there is something out there that makes this possible.
The problem here is that the original (pytorch?) export to .onnx creates a long chain of operations (i.e. the ConstantOfShape, Concat, Reshape, Slice..., Cast) just to end up as the pads input to the Pad operator. All this runtime calculation just to achieve what needs to be a compile-time constant (i.e. onnx2c should recognize this pads tensor as a constant; otherwise the model has inference-time dynamic memory allocations).
onnxsim should be able to do this simplification. But as I understand it, the output of onnxsim in your repository is invalid, so it seems you found an onnxsim bug. For some strange reason, I can't even re-run onnxsim to verify this.
It does sound strange that if you provide a constant padding tensor to the Pad operator in pytorch, the export would create such a long chain of operations to achieve this. Maybe there is a way of simplifying the .onnx at export time?
Hi, thanks for the reply. Yes, the onnx model is a PyTorch export.
I will have a look at the export. Maybe I can find a solution for the Pad node there, or a simplification of the export.