large models

Question

eschmidbauer opened this issue 6 months ago · 4 comments

Hello,
Thank you for sharing this work. I've noticed how well these models work with smaller audio files. Are there any plans to release the large models?
Thanks

Answer 1 · 2024-07-01T14:04:25.000Z

Follow-up: I fine-tuned Whisper large-v3 using the script in this repo, and short audio clips still suffer from the same issue.
Is it possible that large-v3 needs more training? If so, can you share some details on how to do that? Thanks again.

Answer 2 · 2024-07-04T11:49:06.000Z

Hi, do you plan on sharing how you did it? Especially GPU requirements, scripts, time needed, etc.? Also, what's your opinion on quantization-aware fine-tuning? IIRC, it can potentially greatly reduce the size and increase the speed at no additional computing cost.

Answer 3 · 2024-07-19T19:35:39.000Z

https://gist.github.com/eschmidbauer/c1bb441028a61db19d833a289688e8f6
This is a slightly modified version of the script provided in the repo.

Answer 4 · 2024-07-19T20:21:58.000Z

Thanks! Do you plan on sharing the model? I'm also interested in:

GPU requirements, time needed,

and

Also, what's your opinion on quantization-aware fine-tuning? IIRC, it can potentially greatly reduce the size and increase the speed at no additional computing cost.