Add Audio Transcription Capability?

Question

doxgt opened this issue 4 months ago · 1 comment

Greetings.

I have been able to use the OpenAI Python library to send audio recordings to OpenAI for transcription (https://platform.openai.com/docs/guides/speech-to-text/quickstart).

However, I am wondering about WinHTTP-based interactions with OpenAI, as you demonstrated in your utility. I'd always prefer AHK to tinkering with Python.

Do you happen to have any plans to add a module for uploading audio files for transcription?

If not, could you point me toward how to interface with the API for uploading audio files? I figured I would be doing something along the lines of ComObject("WinHttp.WinHttpRequest.5.1").SetRequestHeader("Content-Type", "multipart/form-data").

Instead of "https://api.openai.com/v1/chat/completions", the speech API URL is "https://api.openai.com/v1/audio/transcriptions", and the API model would be "whisper-1".

Then I am not sure where to go from there.
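
For reference, the step after setting the header is building the multipart body itself. Below is a minimal sketch in Python (standard library only, not the OpenAI library) of what that body looks like on the wire, assuming the endpoint expects a "file" part and a "model" field as described in the speech-to-text guide. The helper names build_multipart and transcribe are hypothetical, made up for this illustration; they are not part of any library or of this utility.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_multipart(fields, file_field, filename, file_bytes,
                    file_type="application/octet-stream", boundary=None):
    """Hypothetical helper: return (body_bytes, content_type_header)."""
    boundary = boundary or uuid.uuid4().hex
    delim = b"--" + boundary.encode("ascii") + b"\r\n"
    body = b""
    # Plain text fields, e.g. model=whisper-1
    for name, value in fields.items():
        body += delim
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        body += str(value).encode() + b"\r\n"
    # The file part carries a filename and its own Content-Type
    body += delim
    body += (f'Content-Disposition: form-data; name="{file_field}"; '
             f'filename="{filename}"\r\n').encode()
    body += f"Content-Type: {file_type}\r\n\r\n".encode()
    body += file_bytes + b"\r\n"
    # Closing delimiter: boundary followed by two extra dashes
    body += b"--" + boundary.encode("ascii") + b"--\r\n"
    return body, f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key):
    """Hypothetical helper: POST an audio file and return the transcript text."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"model": "whisper-1"}, "file", os.path.basename(path), audio)
    req = urllib.request.Request(
        API_URL, data=body, method="POST",
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        # The default JSON response carries the transcript under "text"
        return json.loads(resp.read())["text"]
```

The detail that matters for a WinHTTP port is that the Content-Type header has to carry the boundary parameter (a bare "multipart/form-data" is not enough), and the body has to be sent as raw bytes because the audio part is binary. How best to hand those bytes to WinHttpRequest from AHK is not covered here.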

Many thanks in advance!

Answer 1 · 2024-05-19T17:57:01.000Z

I figured out how to use cURL to send audio files. No further action is needed here. Thanks for taking a look if you did.