WhisperOnline

Speech recognition using the OpenAI Whisper API

Primary language: Python · License: GNU General Public License v3.0 (GPL-3.0)

OpenAI Whisper - Sample Implementation

This is a sample implementation of OpenAI Whisper that works on Ubuntu 20.04 LTS.

About OpenAI Whisper

OpenAI Whisper is a speech recognition model that OpenAI also hosts behind a transcription API endpoint. This sample implementation sends audio from a .wav file to the Whisper endpoint and prints the transcribed text.

Installation

To get started, make sure you have Python and pip installed on your system. Then, run the following command to install the required packages:

pip install -r requirements.txt

You may also need to install some additional dependencies on your system via apt-get:

sudo apt-get install liblzma-dev libbz2-dev tk-dev portaudio19-dev

Setting Up Your OpenAI API Key

To use the OpenAI Whisper API endpoint, you must first set up an account with OpenAI and obtain an API key. Once you have your API key, set it as an environment variable named OPENAI_API_KEY.
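On Linux or macOS, you can export the key in your shell before running the script. The key value below is a placeholder; substitute your real key from the OpenAI dashboard:

```shell
# Make the key available to the script for this shell session.
# Replace the placeholder with your actual API key.
export OPENAI_API_KEY="sk-your-key-here"
```

To make the setting persistent across sessions, add the same line to your ~/.bashrc or ~/.profile.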

Usage

To use this sample implementation, you'll need a .wav file containing the audio you want to transcribe. Once you have your .wav file, simply run the following command from your terminal:

python main.py

This will begin the transcription process and output the resulting text to your terminal window.
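For orientation, a Whisper API call of this kind typically looks like the sketch below. This is illustrative rather than the project's actual main.py: the function name transcribe_wav is made up, and it assumes the pre-1.0 openai Python package (newer versions of the package use client.audio.transcriptions.create instead):

```python
import os

def transcribe_wav(path):
    """Send a .wav file to the Whisper API and return the transcript text.

    Hypothetical helper, not the project's real code.
    """
    import openai  # imported lazily; requires the `openai` package (pre-1.0 API shown)
    openai.api_key = os.environ["OPENAI_API_KEY"]
    with open(path, "rb") as audio_file:
        # "whisper-1" is the hosted Whisper model name on the API.
        result = openai.Audio.transcribe(model="whisper-1", file=audio_file)
    return result["text"]
```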

Note: The default configuration transcribes 5 seconds of audio from the beginning of the file. You can adjust the length of the audio to transcribe by modifying the "seconds" parameter value in the code.
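The trimming idea can be sketched with Python's standard wave module. The function below is illustrative only; the real main.py may read and slice the audio differently:

```python
import wave

def read_first_seconds(path, seconds=5):
    """Read at most `seconds` of audio frames from a .wav file.

    Returns raw PCM bytes, which could then be sent for transcription.
    """
    with wave.open(path, "rb") as wav:
        # Frames per second times desired duration gives the frame budget.
        frame_count = int(wav.getframerate() * seconds)
        return wav.readframes(frame_count)
```

Changing the `seconds` argument here plays the same role as the "seconds" parameter mentioned above.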

Contributing

If you'd like to contribute to this project, feel free to submit a pull request! We welcome contributions from the community.

Conclusion

That's it! You should now have a working implementation of OpenAI Whisper on your system. If you have any questions or need help getting things set up, don't hesitate to reach out to the OpenAI community or file an issue here on GitHub.

By the way, did you notice that this README was generated by ChatGPT?