Investigate bundling distil-medium.en
fire opened this issue · 10 comments
fire commented
See https://github.com/ggerganov/whisper.cpp/tree/master/models
May want to quantize it to 5 bits.
fire commented
./quantize ggml-large-32-2.en.bin ggml-large-5_1.en.bin q5_1
ggml_common_quantize_0: model size = 2884.75 MB
ggml_common_quantize_0: quant size = 556.53 MB | ftype = 9 (q5_1)
ggml_common_quantize_0: hist: 0.075 0.059 0.060 0.061 0.063 0.065 0.067 0.080 0.071 0.060 0.058 0.056 0.054 0.053 0.053 0.066
main: quantize time = 4291.86 ms
main: total time = 4291.86 ms
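As a rough sanity check on the sizes in the log above, here is a minimal sketch of the arithmetic. It assumes the input model stores its weights as 32-bit floats (suggested by the "32" in the file name, but not confirmed in this thread) and uses the q5_1 block layout from ggml: 32 weights per block, stored as an fp16 scale, an fp16 minimum, 4 bytes of high bits, and 16 bytes of low nibbles.

```python
# Back-of-the-envelope check of the q5_1 sizes reported by ./quantize.
# Assumption: the unquantized model holds 32-bit float weights.

Q5_1_BLOCK_WEIGHTS = 32              # weights packed into one q5_1 block
Q5_1_BLOCK_BYTES = 2 + 2 + 4 + 16    # fp16 scale + fp16 min + high bits + low nibbles


def bits_per_weight(block_bytes: int, block_weights: int) -> float:
    """Effective storage cost of one weight, including per-block metadata."""
    return block_bytes * 8 / block_weights


def compression_ratio(original_mb: float, quantized_mb: float) -> float:
    """How many times smaller the quantized file is."""
    return original_mb / quantized_mb


q5_1_bits = bits_per_weight(Q5_1_BLOCK_BYTES, Q5_1_BLOCK_WEIGHTS)
ideal_ratio = 32 / q5_1_bits                          # if every tensor were quantized
observed_ratio = compression_ratio(2884.75, 556.53)   # sizes taken from the log

print(f"q5_1 bits per weight: {q5_1_bits:.1f}")
print(f"ideal ratio from f32: {ideal_ratio:.2f}x")
print(f"observed ratio:       {observed_ratio:.2f}x")
```

The observed ratio comes out slightly below the ideal one, most likely because some tensors are left unquantized.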
fire commented
Rename ggml-large-5_1.en.bin to ggml-tiny.en.bin. Probably needs to be a string path.
fire commented
Important: move the file to res://addons/godot_whisper/models/ggml-tiny.en.bin
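The rename-and-move steps can be scripted. The following is a minimal sketch, assuming that res:// maps to the root directory of the Godot project; `install_model` and its parameters are hypothetical names used for illustration and are not part of godot_whisper.

```python
import shutil
from pathlib import Path

# Target location from the comment above, relative to the project root (res://).
MODEL_RES_PATH = "addons/godot_whisper/models/ggml-tiny.en.bin"


def install_model(quantized_model: Path, project_root: Path) -> Path:
    """Copy a quantized model to the path the addon currently loads (hypothetical helper).

    The copy is renamed to ggml-tiny.en.bin on the way, matching the
    workaround described in this thread.
    """
    target = project_root / MODEL_RES_PATH
    target.parent.mkdir(parents=True, exist_ok=True)  # create models/ if missing
    shutil.copy2(quantized_model, target)             # keep the original file in place
    return target
```

For example, `install_model(Path("ggml-large-5_1.en.bin"), Path("path/to/project"))` would place the quantized model under the addon's models directory. Copying instead of moving keeps the original around in case another quantization is tried later.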
fire commented
ggml-distilled-large-q5_1.en is too slow.
fire commented
./quantize ggml-medium-32-2.en.bin ggml-tiny.en.bin q5_1
ggml_common_quantize_0: model size = 1504.42 MB
ggml_common_quantize_0: quant size = 293.35 MB | ftype = 9 (q5_1)
ggml_common_quantize_0: hist: 0.073 0.058 0.059 0.061 0.064 0.066 0.069 0.081 0.072 0.061 0.059 0.056 0.054 0.052 0.052 0.065
main: quantize time = 2933.80 ms
main: total time = 2933.80 ms
This failed too, because it is also too slow.
Trying ggml-distilled-large-q5_1, but looking into GPU optimization.
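To compare the two quantization runs side by side, the size lines can be pulled out of the logs with a small parser. This is a sketch that assumes the log format shown in this thread; `parse_sizes` is a hypothetical helper, and the log snippets are copied from the comments above.

```python
import re

# Matches the "model size" and "quant size" lines printed by ./quantize.
SIZE_RE = re.compile(r"ggml_common_quantize_0:\s+(model|quant) size\s+=\s+([\d.]+) MB")


def parse_sizes(log: str) -> dict[str, float]:
    """Return {"model": MB before quantization, "quant": MB after}."""
    return {kind: float(mb) for kind, mb in SIZE_RE.findall(log)}


LARGE_LOG = """\
ggml_common_quantize_0: model size = 2884.75 MB
ggml_common_quantize_0: quant size = 556.53 MB | ftype = 9 (q5_1)
"""

MEDIUM_LOG = """\
ggml_common_quantize_0: model size = 1504.42 MB
ggml_common_quantize_0: quant size = 293.35 MB | ftype = 9 (q5_1)
"""

for name, log in (("distil-large", LARGE_LOG), ("distil-medium", MEDIUM_LOG)):
    sizes = parse_sizes(log)
    ratio = sizes["model"] / sizes["quant"]
    print(f"{name}: {sizes['model']:.2f} MB -> {sizes['quant']:.2f} MB ({ratio:.2f}x smaller)")
```

Both runs shrink the file by roughly the same factor, so q5_1 helps with bundle size; the speed problem reported above has to be addressed separately.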
Ughuuu commented
I added it to the option dropdown for small size. It can also be downloaded manually if people want. It's just not that fast imo.
fire commented
Thanks