onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 211 by 421
QMZ321 opened this issue · 14 comments
When I try to run the inference demo, I get an error:
```python
import librosa
from espnet_onnx import Speech2Text

speech2text = Speech2Text(model_dir='/root/autodl-tmp/.cache/espnet_onnx/librispeech_100-asr-conformer-aed')
wav_file = '../wav_test/121-121726-0000.wav'
y, sr = librosa.load(wav_file, sr=16000)
nbest = speech2text(y)
```
Error info:
```
/root/miniconda3/envs/espnet-onnx/lib/python3.8/site-packages/espnet_onnx/utils/abs_model.py:63: UserWarning: Inference will be executed on the CPU. Please provide gpu providers. Read "How to use GPU on espnet_onnx" in readme in detail.
  warnings.warn(
2023-04-15 19:34:26.289677552 [E:onnxruntime:, sequential_executor.cc:368 Execute] Non-zero status code returned while running Add node. Name:'Add_398' Status Message: /hdd/doc/onnxruntime/onnxruntime/core/providers/cpu/math/element_wise_ops.h:523 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 211 by 421
Traceback (most recent call last):
  File "inference.py", line 9, in <module>
    nbest = speech2text(y)
  File "/root/miniconda3/envs/espnet-onnx/lib/python3.8/site-packages/espnet_onnx/asr/asr_model.py", line 79, in __call__
    enc, _ = self.encoder(speech=speech, speech_length=lengths)
  File "/root/miniconda3/envs/espnet-onnx/lib/python3.8/site-packages/espnet_onnx/asr/model/encoders/encoder.py", line 70, in __call__
    self.forward_encoder(feats, feat_length)
  File "/root/miniconda3/envs/espnet-onnx/lib/python3.8/site-packages/espnet_onnx/asr/model/encoders/encoder.py", line 87, in forward_encoder
    self.encoder.run(["encoder_out", "encoder_out_lens"], {
  File "/root/miniconda3/envs/espnet-onnx/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node. Name:'Add_398' Status Message: /hdd/doc/onnxruntime/onnxruntime/core/providers/cpu/math/element_wise_ops.h:523 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 211 by 421
```
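For context, this error means the ONNX `Add` node received two tensors whose last dimensions (211 and 421) cannot be broadcast together, since neither is 1. A minimal NumPy sketch (NumPy follows the same broadcasting rules as the ONNX `Add` operator) reproduces the failure:

```python
import numpy as np

# Two score matrices with mismatched trailing dimensions,
# mimicking the 211-by-421 shapes from the error message.
matrix_ac = np.zeros((1, 211, 211))  # (batch, time, time)
matrix_bd = np.zeros((1, 211, 421))  # (batch, time, 2*time - 1)

try:
    matrix_ac + matrix_bd  # broadcasting fails: 211 vs 421, neither is 1
except ValueError as e:
    print("broadcast error:", e)
```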
Please help me, thank you very much!
I found that the problem is not with loading the wav file, because when I switched to the timit model from the demo, the same wav file worked normally. I guess there is a problem with the export of the librispeech conformer model, please help me!

After trying to load 'kamo-naoyuki/timit_asr_train_asr_raw_word_valid.acc.ave' and 'kamo-naoyuki/librispeech_asr_train_asr_conformer5_raw_bpe5000_frontend_confn_fft400_frontend_confhop_length160_scheduler_confwarmup_steps25000_batch_bins140000000_optim_conflr0.0015_initnone_sp_valid.acc.ave' from espnet_model_zoo, I guess the problem is with espnet-onnx's `export_from_zip`, because when I use `export_from_pretrained`, the error does not happen.
Hi @QMZ321, thank you for reporting your issue.
Am I right that you could successfully export the model kamo-naoyuki/timit_asr_train_asr_... with the `export_from_pretrained` method? Basically, `export_from_pretrained` and `export_from_zip` do the same process, so I don't think we have an issue with `export_from_zip` if `export_from_pretrained` works.
The issue happens with the `Add_398` node, so it might be an issue with positional encoding. If possible, would you install netron and visualize the model structure around `Add_398` for further debugging? We need to figure out exactly what operation is causing this problem. (You can use Ctrl + F to search for the node in netron.)
Hi @Masao-Someki, thank you for your reply!
This is the result of the netron visualization.
Thank you, it seems that the sequence length is different in the following part. I think the sequence length of `matrix_ac` is 211 while `matrix_bd` is 421.
Would you check your rel_shift version? It seems that the model uses the latest version of relative position embedding while `legacy_rel_shift` is used during the attention calculation.
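As a rough illustration of why the two lengths differ: the newer relative positional encoding produces position embeddings of length 2*T - 1 for an input of length T (here 2*211 - 1 = 421), and the rel_shift step is what folds the (T, 2T-1) position-score matrix `matrix_bd` back to (T, T) before it is added to `matrix_ac`. Below is a hedged NumPy sketch of that shift, modeled loosely on the ESPnet-style pad-and-reshape trick, not the exact espnet_onnx code:

```python
import numpy as np

def rel_shift(x):
    """Fold relative-position scores of shape (batch, head, T, 2*T-1)
    down to (batch, head, T, T) via the pad-and-reshape trick."""
    b, h, t, pos = x.shape                               # pos == 2*t - 1
    zero_pad = np.zeros((b, h, t, 1), dtype=x.dtype)
    x_padded = np.concatenate([zero_pad, x], axis=-1)    # (b, h, t, 2t)
    x_padded = x_padded.reshape(b, h, pos + 1, t)        # (b, h, 2t, t)
    return x_padded[:, :, 1:].reshape(b, h, t, pos)[..., :t]

T = 211
matrix_ac = np.zeros((1, 4, T, T))
matrix_bd = np.zeros((1, 4, T, 2 * T - 1))               # last dim is 421
scores = matrix_ac + rel_shift(matrix_bd)                # shapes now match
print(scores.shape)  # (1, 4, 211, 211)
```

If the shift is skipped (or the legacy variant expecting a (T, T) input is applied to a (T, 2T-1) tensor), the subsequent Add sees exactly the 211-by-421 mismatch in the error above.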
espnet_onnx/espnet_onnx/export/asr/models/multihead_att.py
Lines 86 to 96 in c074393
@Masao-Someki Sorry, I'm a newbie and don't know how to check the rel_shift version. This is the config I used during espnet training; I don't know if it helps:
https://github.com/espnet/espnet/blob/master/egs2/librispeech_100/asr1/conf/tuning/train_asr_conformer_lr2e-3_warmup15k_amp_nondeterministic.yaml
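For reference, the fields that select the positional-encoding and attention classes in an ESPnet conformer config typically look like the following (an illustrative fragment, not copied from the linked file):

```yaml
encoder: conformer
encoder_conf:
    pos_enc_layer_type: rel_pos            # -> RelPositionalEncoding
    selfattention_layer_type: rel_selfattn # -> RelPositionMultiHeadedAttention
    rel_pos_type: latest                   # "legacy" selects the Legacy* classes
```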
@QMZ321
It seems that the configuration is correct. Just for clarification, which version of PyTorch, onnx, onnxruntime and espnet_onnx do you use?
If you installed espnet_onnx via pip, then would you clone this repository and check if the issue still happens with the latest script?
> It seems that the configuration is correct. Just for clarification, which version of PyTorch, onnx, onnxruntime and espnet_onnx do you use?
My versions are: pytorch=1.13.1, onnx=1.11.0, onnxruntime=1.11.1.espnet, espnet_onnx=0.1.10.
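In case it helps others reading this, installed versions can be checked from the standard library without relying on each package's `__version__` attribute. A small helper sketch:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    out = {}
    for pkg in packages:
        try:
            out[pkg] = version(pkg)
        except PackageNotFoundError:
            out[pkg] = None
    return out

print(installed_versions(["torch", "onnx", "onnxruntime", "espnet_onnx"]))
```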
> If you installed espnet_onnx via pip, then would you clone this repository and check if the issue still happens with the latest script?
After I cloned this repository, I ran the command `python setup.py install`.
I tried again, but the same problem occurs.
@QMZ321
Would you check the type of the classes for position embedding and multihead attention?
In that configuration, the position embedding should be the `RelPositionalEncoding` class, and the attention should be the `RelPositionMultiHeadedAttention` class.
I think this issue happens with the `LegacyRelPositionMultiHeadedAttention` class.
@Masao-Someki
For debugging, I extracted the training part separately:
I inserted a breakpoint to check the type of `rel_pos_type`:
The result is as follows:
@Masao-Someki
I found that as long as the optimize option is disabled,

```python
from espnet_onnx.export import ASRModelExport

m = ASRModelExport()
# m = ASRModelExport('/root/autodl-tmp/.cache/espnet_onnx')
m.export_from_zip(
    '/root/autodl-tmp/projects/espnet_onnx_project/model/espnet/asr_train_asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_raw_en_bpe5000_sp_valid.acc.ave.zip',
    tag_name='librispeech_100-asr-conformer-aed',
    quantize=True,
    # optimize=True
)
```

it works fine.
But I think the optimize option is very important for me, please help me.
@Masao-Someki
Maybe it's a problem with the custom onnxruntime? The optimize option depends on the custom onnxruntime, and I'm not sure if my custom onnxruntime is correct, because the link ("here") below doesn't work. Instead, I used the latest version of the custom onnxruntime from the releases page.
@QMZ321 Sorry for the late reply.
Would you set the `use_ort_for_espnet` configuration to True, if it is not set?
```python
from espnet_onnx.export import ASRModelExport

m = ASRModelExport()
m.set_export_config(
    max_seq_len=5000,
    use_ort_for_espnet=True,
)
m.export_from_pretrained(tag_name, quantize=False, optimize=True)
```
If this fixes your issue, then I will modify some documents to mention this configuration...
@Masao-Someki
Thank you for your reply!
Through your method, I successfully solved the problem!
Thank you very much!
:)