futo-org/whisper-acft

large models

eschmidbauer opened this issue · 4 comments

Hello-
Thank you for sharing this work. I've noticed how well these models work with smaller audio files. Are there any plans to release the large models?
Thanks

Follow-up: I fine-tuned Whisper large-v3 using the script in this repo, and short audio clips still suffer from the same issue.
Is it possible that large-v3 needs more training? If so, can you share some details on how to do that? Thanks again.

Hi, do you plan on sharing how you did it? Especially GPU requirements, scripts, time needed, etc.? Also, what's your opinion on quantization-aware fine-tuning? IIRC it can potentially greatly reduce the size and increase the speed, at no additional computing cost.

Thanks! Do you plan on sharing the model? I'm also interested in:

GPU requirements, time needed,

and

Also, what's your opinion on quantization-aware fine-tuning? IIRC it can potentially greatly reduce the size and increase the speed, at no additional computing cost.
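
For anyone following the quantization question: quantization-aware fine-tuning generally means simulating low-precision rounding of the weights during training so the model learns to tolerate it. Below is a minimal plain-Python sketch of that rounding step (often called "fake quantization"), assuming symmetric per-tensor int8. The names `fake_quantize` and `storage_bytes` and the parameter count are hypothetical, chosen for illustration; none of this comes from the script in this repo.

```python
def fake_quantize(weights, num_bits=8):
    """Round floats to a symmetric integer grid, then map them back to floats.

    Assumption: one scale for the whole tensor (per-tensor quantization).
    Returns the dequantized weights and the scale that was used.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights), 1.0  # nothing to scale
    scale = max_abs / qmax
    # Quantize: divide by the scale, round, and clamp to the integer range.
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    # Dequantize: these are the values the forward pass would see.
    return [q * scale for q in quantized], scale


def storage_bytes(num_params, bits_per_param):
    """Raw weight storage, ignoring scales and file-format overhead."""
    return num_params * bits_per_param // 8


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    dequantized, scale = fake_quantize(weights)
    print("dequantized:", dequantized)
    print("max rounding error:", max(abs(a - b) for a, b in zip(weights, dequantized)))

    hypothetical_params = 1_000_000_000  # illustrative count, not a measured figure
    print("fp16 bytes:", storage_bytes(hypothetical_params, 16))
    print("int8 bytes:", storage_bytes(hypothetical_params, 8))
```

Under these assumptions the weight storage halves going from 16-bit to 8-bit, and each weight moves by at most half a quantization step. Whether accuracy on short clips holds up after such rounding would have to be measured; the sketch says nothing about that.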