Azure Whisper fails to recognize speech

Question

Azure Whisper fails to recognize speech.

Closed this issue 9 months ago · 6 comments

docker-compose configuration:

 # Options: [openai-whisper, azure-whisper, azure-stt]
 STT_TYPE: azure-whisper

 # Azure Whisper
 AZURE_WHISPER_API_BASE: https://northcentralus.api.cognitive.microsoft.com/
 AZURE_WHISPER_KEY: **************
 AZURE_WHISPER_DEPLOYMENT_NAME: whisper
 AZURE_WHISPER_API_VERSION: 2023-09-01-preview

The log shows:
Starting new HTTPS connection (1): northcentralus.api.cognitive.microsoft.com:443
folotoy-folotoy-1 | 2024-01-05 11:08:51,250 - DEBUG - https://northcentralus.api.cognitive.microsoft.com:443 "POST //openai/deployments/whisper/audio/transcriptions?api-version=2023-09-01-preview HTTP/1.1" 404 198
folotoy-folotoy-1 | 2024-01-05 11:08:51,252 - DEBUG - [Dkey=F234103024] STT(azure-whisper) request time cost: 1.74s
folotoy-folotoy-1 | 2024-01-05 11:08:51,252 - ERROR - LLM error: Traceback (most recent call last):
folotoy-folotoy-1 | File "core/speech_wav_processor.py", line 100, in core.speech_wav_processor.SpeechWavProcessor.write_wav
folotoy-folotoy-1 | KeyError: 'text'
folotoy-folotoy-1 | Traceback (most recent call last):
folotoy-folotoy-1 | File "core/speech_wav_processor.py", line 100, in core.speech_wav_processor.SpeechWavProcessor.write_wav
folotoy-folotoy-1 | KeyError: 'text'

I have checked repeatedly, and the key is correct.

Answer 1 · 2024-01-05T03:47:02.000Z

AZURE_WHISPER_API_BASE should not be the cognitive services address. Use the Azure OpenAI URL instead. For example, if the Azure OpenAI resource you deployed is named xxxx, the URL is:
https://xxxx.openai.azure.com/
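
To make the difference concrete, here is a minimal Python sketch of how the request URL seen in the log could be assembled from the three settings. The helper name build_transcription_url is hypothetical and not part of FoloToy; the path layout is taken from the POST line in the log above. Stripping the trailing slash from the base is an assumption about how to avoid the double slash ("//openai/...") visible in that request; whether the double slash contributes to the 404 is not confirmed here.

```python
from urllib.parse import urlencode


def build_transcription_url(api_base: str, deployment: str, api_version: str) -> str:
    """Assemble an Azure OpenAI Whisper transcription URL (hypothetical helper)."""
    # Drop any trailing slash so the joined path does not start with "//".
    base = api_base.rstrip("/")
    query = urlencode({"api-version": api_version})
    return f"{base}/openai/deployments/{deployment}/audio/transcriptions?{query}"


# "xxxx" is the placeholder resource name used in the answer above.
url = build_transcription_url(
    "https://xxxx.openai.azure.com/", "whisper", "2023-09-01-preview"
)
print(url)
```

With the values from the question, only the host part changes: the deployment name and API version stay where they are in the path and query string.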

Answer 2 · 2024-01-05T03:49:52.000Z

See here: https://speech.microsoft.com/portal/whisperspeechtotext

Answer 3 · 2024-01-05T04:42:15.000Z

But my address is not openai.azure.com.

Answer 4 · 2024-01-05T04:52:14.000Z

Refer to the settings in this document:
https://learn.microsoft.com/zh-cn/azure/ai-services/openai/whisper-quickstart?tabs=command-line&pivots=rest-api#rest-api

Answer 5 · 2024-01-05T04:57:25.000Z

This endpoint was assigned to me and cannot be changed.

Answer 6 · 2024-01-05T09:05:44.000Z

You need to fill in the deployment name of the model you deployed yourself, and it must be correct.
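
For context, the KeyError: 'text' in the log is consistent with the client reading the text field from the body of the 404 response, which would not contain it. The sketch below shows one way a caller could check the status first so that a wrong base URL or deployment name surfaces as a readable error. The function extract_transcript is hypothetical, and the shape of the error body (an "error" object with "code" and "message") is an assumption, not something shown in the log.

```python
def extract_transcript(status_code: int, payload: dict) -> str:
    """Return the transcript, or raise a descriptive error (hypothetical helper)."""
    if status_code != 200:
        # Assumed error shape: {"error": {"code": ..., "message": ...}}
        error = payload.get("error", {})
        raise RuntimeError(
            f"STT request failed with HTTP {status_code}: "
            f"{error.get('code')}: {error.get('message')}"
        )
    if "text" not in payload:
        raise RuntimeError("STT response did not contain a 'text' field")
    return payload["text"]


# A successful response carries the transcript under "text".
print(extract_transcript(200, {"text": "hello"}))

# A 404 now reports the HTTP status instead of failing with KeyError: 'text'.
try:
    extract_transcript(404, {"error": {"code": "404", "message": "Resource not found"}})
except RuntimeError as exc:
    print(exc)
```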