microsoft/Olive

UnboundLocalError: local variable 'output_model_json' referenced before assignment

prafulkl opened this issue · 9 comments

In the Olive/examples/whisper/
I did the

python -m pip install -r requirements.txt
python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
python -m pip install librosa
python -m olive.workflows.run --config whisper_cpu_int8.json --setup
python -m olive.workflows.run --config whisper_cpu_int8.json 2> /dev/null

But when I run the
python test_transcription.py --config whisper_cpu_int8.json
I first get

    from olive.model import ONNXModelHandler
ImportError: cannot import name 'ONNXModelHandler' from 'olive.model' (/home/kadellabs/anaconda3/envs/whisper_onnx/lib/python3.10/site-packages/olive/model/__init__.py)

I installed olive-ai from source:
pip install git+https://github.com/microsoft/Olive

But this time I am getting

    olive_model = ONNXModelHandler(**output_model_json["config"])
UnboundLocalError: local variable 'output_model_json' referenced before assignment

On my system, the models directory is empty.
I checked the expected name of the output file in the models directory:
>>> print(f"**/{config['engine']['output_name']}_{accelerator_spec}_model.json")
**/whisper_cpu_int8_cpu-cpu_model.json

But the models folder does not have any files.
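One way to confirm this is to glob for the expected output JSON directly. A minimal sketch (the "models" path and file name are taken from this thread; adjust if your output directory differs):

```python
# Sketch: check whether the Olive run produced the expected output
# model JSON. An empty result means the optimization run failed.
from pathlib import Path

pattern = "**/whisper_cpu_int8_cpu-cpu_model.json"
matches = list(Path("models").glob(pattern))
if not matches:
    print("No output model JSON found - the optimization run likely failed.")
else:
    print(f"Found: {matches[0]}")
```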

I ran all the mentioned steps, but I can't figure out what the issue is.

The models directory should not be empty. This probably means one or more of the optimization steps did not finish correctly. Did you run the python -m olive.workflows.run --config whisper_cpu_int8.json part again after installing olive from source?

Please run the

python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
python -m olive.workflows.run --config whisper_cpu_int8.json --setup
python -m olive.workflows.run --config whisper_cpu_int8.json

again without the 2> /dev/null part. If it still fails, this should print the logs that you can share here.

I just wanted to let you know that now it works for the default file.

But again, when I tried it with another test file, it gave me this error:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running AudioDecoder node. Name:'AudioDecoder_1' Status Message: 2: [AudioDecoder]: only down-sampling supported.

The whisper model requires an audio file with a sample rate of 16000 Hz. The audio decoder supports down-sampling, so it can also handle higher sample rates.

But audio files with lower sample rates are not valid, which seems to be the case for you.

You can either resample the audio and save it again.

Or try using an audio loading function like https://github.com/openai/whisper/blob/main/whisper/audio.py#L25 instead of https://github.com/microsoft/Olive/blob/main/examples/whisper/code/whisper_dataset.py#L40
I haven't tried the second one though.
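Before resampling, you can verify the input's sample rate. A minimal sketch for WAV inputs using only the standard library (the input path is a hypothetical example, not from the repo):

```python
# Sketch: read a WAV file's sample rate; the AudioDecoder only
# down-samples, so anything below 16000 Hz must be resampled first.
import wave

def sample_rate(path):
    with wave.open(path, "rb") as w:
        return w.getframerate()

# Hypothetical usage:
# if sample_rate("input.wav") < 16000:
#     print("resample first: the AudioDecoder only down-samples")
```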


This time I am getting this error:

Traceback (most recent call last):
  File "/home/user/anaconda3/envs/whisper_onnx/lib/python3.10/site-packages/whisper/audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "/home/user/anaconda3/envs/whisper_onnx/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', PosixPath('/tmp/tmpb7umuvf3'), '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/Olive/examples/whisper/test_transcription.py", line 142, in <module>
    output_text = main()
  File "/home/user/Olive/examples/whisper/test_transcription.py", line 129, in main
    audio = whisper.load_audio(temp_dir_path)
  File "/home/user/anaconda3/envs/whisper_onnx/lib/python3.10/site-packages/whisper/audio.py", line 60, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 4.4.4-0ubuntu1~22.04.sav1.1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.4.0-1ubuntu1~22.04)
  configuration: --prefix=/usr --extra-version='0ubuntu1~22.04.sav1.1' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
/tmp/tmpb7umuvf3: Is a directory

Try with 'tmp_audio_path' or 'args.audio_path'.

'tmp_dir_path' is a directory that stores a copy of the audio file, but whisper's load_audio expects a path to the file itself.
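The directory-vs-file mix-up can be sketched as a small helper that picks the audio file stored inside the temporary directory (the function name and the single-file assumption are mine, not from the example script):

```python
# Sketch: whisper.load_audio expects a file path, not a directory.
# Return the first file inside the temp directory holding the audio copy.
from pathlib import Path

def first_audio_file(temp_dir_path):
    files = sorted(p for p in Path(temp_dir_path).iterdir() if p.is_file())
    if not files:
        raise FileNotFoundError(f"no files in {temp_dir_path}")
    return str(files[0])

# Hypothetical usage:
# audio = whisper.load_audio(first_audio_file(temp_dir_path))
```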

Okay, the issue is resolved using librosa resampling:

    import librosa
    import soundfile as sf

    audio, sr = librosa.load("data/test.mp3", sr=16000)  # resample to 16 kHz
    sf.write("data/resampled_test.wav", audio, sr)

But the model only takes short chunks of 20-30 seconds, and I want to transcribe 60-minute audio files. If I divide those files into chunks, it will take the same time as the original Whisper model!

I want to optimize the latency of the Whisper model

The whisper model is limited in the length of audio it can process at once, so the Olive-optimized model cannot resolve your long-audio issue.
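For completeness, splitting a long recording into fixed-length windows can be sketched as below (30 seconds is Whisper's per-pass window; the chunks still run sequentially, so total latency grows with audio length regardless of the Olive optimization):

```python
# Sketch: slice a 1-D audio array into fixed-length windows.
# sr and chunk_seconds match the thread's 16 kHz / ~30 s assumptions.
def chunk_audio(audio, sr=16000, chunk_seconds=30):
    step = sr * chunk_seconds
    return [audio[i:i + step] for i in range(0, len(audio), step)]
```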

@prafulkl can we close this issue if you have no other questions?

Yeah please close this thread