SYSTRAN/faster-whisper

Support for Distil-Whisper

sanchit-gandhi opened this issue · 12 comments

Hey @guillaumekln! Thanks for this fantastic resource. We're looking at supporting the Distil-Whisper checkpoints in faster-whisper.

The checkpoints are fairly easy to convert: we always pin the number of decoder layers to 2 and load all 32 encoder layers.
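To make the layer layout concrete, here is a minimal sketch that reads the layer counts from a Hugging Face-style config dictionary instead of assuming the decoder is as deep as the encoder. The `encoder_layers` and `decoder_layers` keys follow the Hugging Face Whisper config; the `LayerLayout` class and `layout_from_config` helper are hypothetical names used only for illustration, not part of faster-whisper or CTranslate2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerLayout:
    """Hypothetical container for the two layer counts a converter needs."""
    encoder_layers: int
    decoder_layers: int


def layout_from_config(config: dict) -> LayerLayout:
    # Read both depths explicitly: a distilled checkpoint keeps the full
    # encoder but has a much shallower decoder, so the two can differ.
    return LayerLayout(
        encoder_layers=int(config["encoder_layers"]),
        decoder_layers=int(config["decoder_layers"]),
    )


# Values taken from the description above: 32 encoder layers, 2 decoder layers.
distil_config = {"encoder_layers": 32, "decoder_layers": 2}
layout = layout_from_config(distil_config)
print(layout)
```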

For inference, we found a chunk length of 15-20s to be optimal for WER performance of the distilled model, see Table 23 of the paper:

[Screenshot: Table 23 of the paper]

Would you be open to a PR allowing the user to specify the chunk length and also the maximum generation length? This would enable full support of Distil-Whisper in faster-whisper!
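As a rough illustration of what a configurable chunk length means in practice, the sketch below splits a waveform into fixed-length chunks and computes how many log-mel frames each chunk produces. It assumes Whisper's usual 16 kHz input and a hop length of 160 samples; `split_into_chunks` and `frames_per_chunk` are hypothetical helpers, and this naive split ignores any overlap or timestamp-based seeking that a real long-form pipeline would use.

```python
SAMPLING_RATE = 16_000  # Whisper models expect 16 kHz mono audio
HOP_LENGTH = 160        # samples between successive log-mel frames


def frames_per_chunk(chunk_length_s: int) -> int:
    """Number of feature frames the encoder sees for one chunk."""
    return chunk_length_s * SAMPLING_RATE // HOP_LENGTH


def split_into_chunks(samples, chunk_length_s: int = 15):
    """Split a sequence of samples into consecutive, non-overlapping chunks.

    The last chunk may be shorter than chunk_length_s.
    """
    chunk_size = chunk_length_s * SAMPLING_RATE
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


# 40 seconds of silence, split into 15-second chunks.
audio = [0.0] * (40 * SAMPLING_RATE)
chunks = split_into_chunks(audio, chunk_length_s=15)
print(len(chunks), [len(c) // SAMPLING_RATE for c in chunks])
print(frames_per_chunk(30), frames_per_chunk(15))
```

With the default 30-second window the encoder receives 3000 frames per chunk; a 15-second chunk halves that to 1500, which is why the chunk length has to be threaded through feature extraction and decoding rather than changed in one place.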

Axbon commented

This would be amazing, one step closer to something that feels realtime-ish.

Is the distilled large-v2 model still multilingual or does it lose that attribute due to how distilling was done?

> Is the distilled large-v2 model still multilingual or does it lose that attribute due to how distilling was done?

It was trained on English audio only so it most probably lost its multilingual capabilities.

> Is the distilled large-v2 model still multilingual or does it lose that attribute due to how distilling was done?

From their repo: "Note: Distil-Whisper is currently only available for English speech recognition. Multilingual support will be provided soon."

Waiting for a multilingual model for this task. Looking forward to it

Subscribing for updates!

can't wait to test it, this will be awesome.

Hi @sanchit-gandhi. FYI, @guillaumekln's account seems to have been inactive since September, which coincides with him leaving his former company. I don't know if he plans to continue maintaining this repo, or if other users such as the OpenNMT devs (cc @vince62s @homink @nguyendc-systran) have ownership of the repo or have forked it. Maybe consider forking it yourself under HF's GitHub namespace?

Hi, I confirm that I'm no longer actively maintaining this repo but other people can still make it move forward. Please ping @nguyendc-systran to merge changes in faster-whisper. For anything related to CTranslate2, please ping @vince62s.

> Hi, I confirm that I'm no longer actively maintaining this repo but other people can still make it move forward. Please ping @nguyendc-systran to merge changes in faster-whisper. For anything related to CTranslate2, please ping @vince62s.

Great job on CTranslate2 and faster-whisper, glad I came across it a while ago now... and good luck in the future.

FYI, distil-whisper should now be supported by CTranslate2: https://github.com/OpenNMT/CTranslate2/releases/tag/v3.21.0

We "just" need to adapt faster-whisper in order to have faster-distil-whisper :)
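For anyone who wants to try the conversion step, here is a hedged sketch that only builds the command line for CTranslate2's `ct2-transformers-converter` tool without running it. The `distil-whisper/distil-large-v2` model ID is the Hugging Face Hub checkpoint; the output directory name and the `build_convert_command` helper are made up for illustration, and the choice of `float16` quantization is an assumption, not a recommendation from this thread.

```python
from typing import List


def build_convert_command(
    model_id: str,
    output_dir: str,
    quantization: str = "float16",
) -> List[str]:
    """Build (but do not execute) a ct2-transformers-converter invocation."""
    return [
        "ct2-transformers-converter",
        "--model", model_id,
        "--output_dir", output_dir,
        # Copy the tokenizer and feature-extractor files next to the
        # converted weights so the model directory is self-contained.
        "--copy_files", "tokenizer.json", "preprocessor_config.json",
        "--quantization", quantization,
    ]


command = build_convert_command(
    "distil-whisper/distil-large-v2",
    "faster-distil-large-v2",  # hypothetical output directory
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once CTranslate2 (v3.21.0 or later, per the release linked above) and `transformers` are installed.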

FYI, I created a PR, #557, to support distil-whisper. Hope it helps.