k2-fsa/sherpa-ncnn

Convert model


I built sherpa-ncnn with MinGW (Windows). Everything works and I get good results.

wav filename: 1089-134686-0001.wav
wav duration (s): 6.625
Started!
Done!
Recognition result for 1089-134686-0001.wav
text:  AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BRAFFLES
timestamps: 0.32 0.92 1 1.12 1.4 1.56 1.6 1.76 1.96 2.08 2.2 2.24 2.4 2.52 2.56 2.64 2.8 3 3.24 3.48 3.6 3.72 3.92 4.36 4.48 4.52 4.76 4.84 5 5.04 5.16 5.2 5.44 5.6 5.8 5.84 5.96 6.04 6.24
Elapsed seconds: 2.155 s
Real time factor (RTF): 2.155 / 6.625 = 0.325
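
For anyone reading the log: the real time factor is the processing time divided by the audio duration, so a value below 1 means decoding runs faster than real time. A minimal sketch that reproduces the number above (the function name `real_time_factor` is only illustrative and is not part of sherpa-ncnn):

```python
def real_time_factor(elapsed_s: float, audio_s: float) -> float:
    """Return processing time divided by audio duration."""
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_s / audio_s


# Values taken from the log above.
rtf = real_time_factor(2.155, 6.625)
print(f"RTF: {rtf:.3f}")  # prints "RTF: 0.325"
```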

But I can't find a converter (e.g. a converter.py) anywhere for turning the original Whisper models into ncnn models.
For example, I would like to run inference with Whisper's multilingual "medium" model.

sherpa-ncnn does not support whisper models.

https://github.com/k2-fsa/sherpa-onnx supports whisper. Its usage is very similar to that of sherpa-ncnn.

You can find converted whisper models for sherpa-onnx at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html
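
As a rough sketch of what decoding with a converted Whisper model could look like through the sherpa-onnx Python package: the helper names `whisper_files` and `transcribe`, the `<name>-encoder.int8.onnx` style file naming, and the thread count are assumptions for illustration, so check them against the files you actually download. It also assumes a 16-bit mono WAV file and that `sherpa_onnx` and `numpy` are installed.

```python
from pathlib import Path
import wave


def whisper_files(model_dir, name="medium", int8=True):
    """Build encoder/decoder/tokens paths (the file naming is an assumption)."""
    suffix = ".int8.onnx" if int8 else ".onnx"
    d = Path(model_dir)
    return {
        "encoder": str(d / f"{name}-encoder{suffix}"),
        "decoder": str(d / f"{name}-decoder{suffix}"),
        "tokens": str(d / f"{name}-tokens.txt"),
    }


def transcribe(wav_path, files, language="", task="transcribe"):
    """Decode one 16-bit mono WAV file and return the recognized text."""
    # Imported lazily so whisper_files() works without these packages.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(
        encoder=files["encoder"],
        decoder=files["decoder"],
        tokens=files["tokens"],
        language=language,  # assumption: empty string leaves detection to the model
        task=task,
        num_threads=2,
    )
    with wave.open(wav_path, "rb") as f:
        assert f.getnchannels() == 1 and f.getsampwidth() == 2
        sample_rate = f.getframerate()
        pcm = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
    samples = pcm.astype(np.float32) / 32768.0  # scale to [-1, 1)

    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, samples)
    recognizer.decode_stream(stream)
    return stream.result.text
```

Something like `transcribe("1089-134686-0001.wav", whisper_files("./sherpa-onnx-whisper-medium"))` would then return the text, where the directory name is again only a placeholder.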

I also tested the small int8 model on an Orange Pi.
It works well. Thank you for your work.
root@orangepi2:~/software/sherpa-ncnn/build# ./bin/sherpa-ncnn-microphone
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
Num devices: 5
Use default device: 3
Name: default
Max input channels: 128
Started
0: all right rich price thank you so much fur taken the time to
join us here is there anything else that you want to add before
i let you go