large models
eschmidbauer opened this issue · 4 comments
Hello,
Thank you for sharing this work. I've noticed how well these models work with smaller audio files. Are there any plans to release the large models?
Thanks
Follow-up: I fine-tuned Whisper large-v3 using the script in this repo, and short audio clips still suffer from the same issue.
Is it possible that large-v3 needs more training? If so, can you share some details on how to do that? Thanks again.
Hi, do you plan on sharing how you did it? Especially GPU requirements, scripts, time needed, etc. Also, what's your opinion on quantization-aware fine-tuning? IIRC it can potentially greatly reduce the size and increase the speed, at no additional computing cost.
https://gist.github.com/eschmidbauer/c1bb441028a61db19d833a289688e8f6
A slightly modified version of the script provided in the repo.
Thanks! Do you plan on sharing the model? I'm also still interested in the GPU requirements and time needed, and in your opinion on quantization-aware fine-tuning. IIRC it can potentially greatly reduce the size and increase the speed, at no additional computing cost.