UnboundLocalError: local variable 'output_model_json' referenced before assignment
In `Olive/examples/whisper/` I ran:

```
python -m pip install -r requirements.txt
python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
python -m pip install librosa
python -m olive.workflows.run --config whisper_cpu_int8.json --setup
python -m olive.workflows.run --config whisper_cpu_int8.json 2> /dev/null
```
But when I run

```
python test_transcription.py --config whisper_cpu_int8.json
```

I first get:

```
from olive.model import ONNXModelHandler
ImportError: cannot import name 'ONNXModelHandler' from 'olive.model' (/home/kadellabs/anaconda3/envs/whisper_onnx/lib/python3.10/site-packages/olive/model/__init__.py)
```

So I installed olive-ai from source:

```
pip install git+https://github.com/microsoft/Olive
```

But this time I get:

```
olive_model = ONNXModelHandler(**output_model_json["config"])
UnboundLocalError: local variable 'output_model_json' referenced before assignment
```
In my system, the `models` directory is empty. I checked the expected name of the output file:

```
>>> print(f"**/{config['engine']['output_name']}_{accelerator_spec}_model.json")
**/whisper_cpu_int8_cpu-cpu_model.json
```

But the `models` folder does not contain any files. I ran all the steps mentioned above but can't figure out what the issue is.
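For context, this kind of `UnboundLocalError` usually means the script searched for the output model JSON, found nothing, and then used the loop variable anyway. A minimal sketch of the failure mode (a hypothetical simplification, not the actual `test_transcription.py` code):

```python
import json
from glob import glob

def load_output_model_config(output_name: str, accelerator_spec: str) -> dict:
    # If the models folder is empty, glob() yields no matches, the loop body
    # never runs, and output_model_json is never assigned.
    for path in glob(f"**/{output_name}_{accelerator_spec}_model.json", recursive=True):
        with open(path) as f:
            output_model_json = json.load(f)
    # With no match above, this line raises UnboundLocalError.
    return output_model_json["config"]

load_output_model_config("whisper_cpu_int8", "cpu-cpu")
```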
The `models` directory should not be empty. This probably means one or more of the optimization steps did not finish correctly. Did you run `python -m olive.workflows.run --config whisper_cpu_int8.json` again after installing Olive from source?
Please run

```
python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
python -m olive.workflows.run --config whisper_cpu_int8.json --setup
python -m olive.workflows.run --config whisper_cpu_int8.json
```

again, without the `2> /dev/null` redirection. If it still fails, this will print logs that you can share here.
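You can also invoke the workflow from Python instead of the CLI, which makes the logs easy to see (a small sketch; I'm assuming the config path can be passed directly, as in the Olive examples):

```python
from olive.workflows import run as olive_run

# Equivalent to `python -m olive.workflows.run --config whisper_cpu_int8.json`;
# with stderr not redirected, the pass/engine logs appear on the console.
olive_run("whisper_cpu_int8.json")
```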
I just wanted to let you know that it now works for the default audio file. But when I tried another test file, I got:

```
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running AudioDecoder node. Name:'AudioDecoder_1' Status Message: 2: [AudioDecoder]: only down-sampling supported.
```
The whisper model requires audio with a 16000 Hz sample rate. The audio decoder supports down-sampling, so it can also handle higher sample rates, but audio files with lower sample rates are not valid, which seems to be the case for you. You can either resample the audio and save it again, or try an audio loading function like https://github.com/openai/whisper/blob/main/whisper/audio.py#L25 instead of https://github.com/microsoft/Olive/blob/main/examples/whisper/code/whisper_dataset.py#L40. I haven't tried the second option myself, though.
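The second option could look roughly like this (untested sketch; `whisper.load_audio` shells out to ffmpeg and returns a float32 array already resampled to 16 kHz):

```python
import whisper

# load_audio resamples to 16 kHz via ffmpeg regardless of the source rate,
# so low-sample-rate files are handled before they reach the AudioDecoder.
audio = whisper.load_audio("data/test.mp3")
```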
This time I am getting this error:

```
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/whisper_onnx/lib/python3.10/site-packages/whisper/audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "/home/user/anaconda3/envs/whisper_onnx/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', PosixPath('/tmp/tmpb7umuvf3'), '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/Olive/examples/whisper/test_transcription.py", line 142, in <module>
    output_text = main()
  File "/home/user/Olive/examples/whisper/test_transcription.py", line 129, in main
    audio = whisper.load_audio(temp_dir_path)
  File "/home/user/anaconda3/envs/whisper_onnx/lib/python3.10/site-packages/whisper/audio.py", line 60, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 4.4.4-0ubuntu1~22.04.sav1.1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.4.0-1ubuntu1~22.04)
  configuration: --prefix=/usr --extra-version='0ubuntu1~22.04.sav1.1' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
/tmp/tmpb7umuvf3: Is a directory
```
Try `tmp_audio_path` or `args.audio_path` instead. `temp_dir_path` is a directory that stores a copy of the audio file, but the whisper audio loader expects a path to the file itself.
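In other words, something like this (the file name here is an assumption; only the directory path comes from your traceback):

```python
from pathlib import Path
import whisper

temp_dir_path = Path("/tmp/tmpb7umuvf3")         # temp directory created by the script
tmp_audio_path = temp_dir_path / "test.mp3"      # hypothetical: the copied audio file inside it
audio = whisper.load_audio(str(tmp_audio_path))  # load_audio needs a file path, not a directory
```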
Okay, the issue is resolved by resampling with librosa:

```python
import librosa
import soundfile as sf

audio, sr = librosa.load("data/test.mp3", sr=16000)  # resample to 16 kHz
sf.write("data/resampled_test.wav", audio, sr)
```
But the model only takes short chunks of about 20-30 seconds, and I want to transcribe 60-minute audio files. If I divide those files into chunks, it will take the same time as the original Whisper model! I want to optimize the latency of the Whisper model.
The Whisper model itself is limited in the length of audio it can process at once (it works on 30-second windows), so the Olive-optimized model cannot resolve your long-audio issue.
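For completeness, chunked transcription could look like this rough sketch (assuming a hypothetical `transcribe(audio)` helper that wraps the optimized ONNX model):

```python
import numpy as np

SAMPLE_RATE = 16000
CHUNK_SECONDS = 30

def transcribe_long(audio: np.ndarray, transcribe) -> str:
    # Split the waveform into 30-second windows, transcribe each with the
    # (hypothetical) per-chunk `transcribe` helper, then join the results.
    chunk = CHUNK_SECONDS * SAMPLE_RATE
    parts = [transcribe(audio[i:i + chunk]) for i in range(0, len(audio), chunk)]
    return " ".join(parts)
```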
Yeah, please close this thread.