huggingface/peft

How to finetune a Whisper model with 'initial_prompt'

v-yunbin opened this issue · 4 comments

When I decode with 'initial_prompt', the results of the Whisper v2 model fine-tuned on my data are bad; without 'initial_prompt', the results are good.
However, the base Whisper v2 model decodes well even when 'initial_prompt' is used. Does this mean that if I want to use 'initial_prompt' during decoding, I also have to include it during training?

Sorry, I don't understand your issue. Could you please explain in more detail what you want to achieve and how? Ideally, show the code that leads to the good and bad results.

Hi. Whisper can use context information to improve recognition accuracy. To pass context information to Whisper, you can use the following CLI argument:
https://github.com/openai/whisper/blob/main/whisper/transcribe.py#L531
parser.add_argument("--initial_prompt", type=str, default=None, help="optional text to provide as a prompt for the first window.")
If the model is fine-tuned without "--initial_prompt", its decoding results get worse when "--initial_prompt" is then used at inference time.
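
For context, the same option is available in the openai-whisper Python API, not just the CLI. A minimal sketch of how it is used at inference time (the audio path and prompt text are placeholders):

```python
import whisper  # pip install openai-whisper

model = whisper.load_model("large-v2")

# initial_prompt conditions the first 30-second window, e.g. with domain terms.
result = model.transcribe(
    "audio.wav",  # placeholder path
    initial_prompt="Glossary: PEFT, LoRA, Whisper fine-tuning.",
)
print(result["text"])
```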

I see. I don't really have any expertise in Whisper or in how the initial prompt affects the outcome, but my best guess is yes: if you want to use it at inference, you should also use it during training, following the same logic as in the script you linked.
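
For anyone landing here later, one possible way to mirror that prompting scheme during fine-tuning with the Hugging Face `WhisperTokenizer` is sketched below. This is an assumption about how it could be done, not an approach confirmed in this thread: `build_prompted_example` is a hypothetical helper, `get_prompt_ids` prepends the `<|startofprev|>` token, and the prompt is masked out of the loss while still being visible to the decoder.

```python
import torch
from transformers import WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2")
tokenizer = processor.tokenizer

def build_prompted_example(prompt_text: str, transcript: str):
    """Concatenate prompt and transcript tokens; mask the prompt from the loss."""
    # <|startofprev|> followed by the prompt tokens -- the same token layout
    # Whisper uses for initial_prompt at inference time.
    prompt_ids = torch.as_tensor(tokenizer.get_prompt_ids(prompt_text), dtype=torch.long)
    # Transcript tokens with Whisper's usual special tokens
    # (<|startoftranscript|>, any configured language/task tokens, <|endoftext|>).
    target_ids = tokenizer(transcript, return_tensors="pt").input_ids[0]

    full_ids = torch.cat([prompt_ids, target_ids])
    labels = full_ids.clone()
    # Mask the prompt positions so the loss is computed only on the transcript.
    labels[: prompt_ids.shape[0]] = -100
    # How these are fed to the model (shifting full_ids for decoder_input_ids,
    # padding, batching) depends on your data collator / training setup.
    return {"prompted_ids": full_ids, "labels": labels}
```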