This Python script records your voice and transcribes it with OpenAI's Whisper API. It saves the recording to a timestamped MP3 file, splits that file into 10-minute chunks, transcribes each chunk with the Whisper API, and writes the transcriptions to a timestamped text file. Finally, it prints the transcriptions to the terminal.
graph TD
A[Start Recording] --> B(Record Audio)
B --> C(Save Recording to WAV File)
C --> D(Convert WAV to MP3)
D --> E(Split MP3 into 10-minute Chunks)
E --> F(Transcribe Each Chunk Using Whisper API)
F --> G(Save Transcriptions to Text File)
G --> H(Print Transcriptions to Terminal)
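The splitting step in the flowchart can be sketched roughly as follows. This is a minimal illustration, not the code from audio_recorder.py: the names `chunk_ranges`, `split_mp3`, and `TEN_MINUTES_MS`, and the `_chunkN.mp3` file naming, are assumptions. It relies on pydub measuring audio positions in milliseconds.

```python
import os

# Sketch only: these names are illustrative and not taken from audio_recorder.py.
TEN_MINUTES_MS = 10 * 60 * 1000  # pydub measures positions in milliseconds


def chunk_ranges(total_ms, chunk_ms=TEN_MINUTES_MS):
    """Return (start, end) millisecond pairs that together cover total_ms."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


def split_mp3(path, chunk_ms=TEN_MINUTES_MS):
    """Split an MP3 file into chunk files. Requires pydub and ffmpeg."""
    # Imported lazily so chunk_ranges stays usable without pydub installed.
    from pydub import AudioSegment

    audio = AudioSegment.from_mp3(path)
    base, _ = os.path.splitext(path)
    chunk_paths = []
    for index, (start, end) in enumerate(chunk_ranges(len(audio), chunk_ms)):
        chunk_path = f"{base}_chunk{index}.mp3"  # assumed naming scheme
        audio[start:end].export(chunk_path, format="mp3")
        chunk_paths.append(chunk_path)
    return chunk_paths
```

With this layout, a 25-minute recording would yield two full 10-minute chunks and one 5-minute remainder.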
- Python 3.6 or higher
- PyAudio
- wave (included in the Python standard library)
- pydub
- openai
- Clone the repository:
  git clone https://github.com/icereed/openai-whisper-voice-transcriber.git
- Install the required packages (wave ships with Python and does not need to be installed):
  pip install pyaudio pydub openai
- Get an OpenAI API key from https://platform.openai.com/account/api-keys
- Set your OpenAI API key as an environment variable:
  export OPENAI_API_KEY=yourapikey
  Alternatively, you can replace os.environ.get("OPENAI_API_KEY") with your actual API key in the script.
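Reading the key from the environment keeps it out of the source file, which makes it harder to commit by accident. A minimal sketch of such a lookup, with a clear error when the variable is missing, might look like this. The helper name `load_api_key` is hypothetical and does not appear in the script.

```python
import os


def load_api_key(env=os.environ):
    """Return the key stored in OPENAI_API_KEY, or raise a clear error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the script."
        )
    return key
```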
- Run the script:
  python audio_recorder.py
- Press Ctrl+C to stop recording.
- The script saves the recorded audio to an MP3 file and splits it into 10-minute chunks.
- The script transcribes each chunk using the Whisper API and saves the transcriptions to a text file.
- The script prints the transcriptions to the terminal.
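The last two steps, saving the transcriptions to a timestamped text file and printing them, could be sketched as below. The file name pattern `transcription_YYYYMMDD_HHMMSS.txt` and both function names are assumptions for illustration; the script may name its output differently.

```python
from datetime import datetime
from pathlib import Path


def transcript_path(now=None, directory="."):
    """Build a timestamped output path (the naming pattern is assumed)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(directory) / f"transcription_{stamp}.txt"


def save_transcripts(chunks, path):
    """Join per-chunk transcriptions, write them to path, and return the text."""
    text = "\n".join(chunks)
    Path(path).write_text(text + "\n", encoding="utf-8")
    return text
```

A caller could then run `print(save_transcripts(chunks, transcript_path()))` to write the file and echo the text to the terminal in one step.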
You can customize the following constants in the script:
- CHUNK: the number of audio samples per frame
- FORMAT: the audio format
- CHANNELS: the number of audio channels (mono or stereo)
- RATE: the audio sample rate
- MAX_RETRIES: the maximum number of retries for failed transcriptions
- RETRY_DELAY: the delay in seconds between retries
- OUTPUT_INTERVAL: how often, in seconds, the script prints the total length of the recording so far
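To illustrate how MAX_RETRIES and RETRY_DELAY might interact, here is a small retry helper. It is a sketch under stated assumptions: the values 3 and 5 are placeholders rather than the script's defaults, `with_retries` is a hypothetical name, and MAX_RETRIES is treated here as the total number of attempts.

```python
import time

MAX_RETRIES = 3  # assumed value, not necessarily the script's default
RETRY_DELAY = 5  # assumed value, in seconds


def with_retries(func, max_retries=MAX_RETRIES, retry_delay=RETRY_DELAY,
                 sleep=time.sleep):
    """Call func() until it succeeds or max_retries attempts have failed."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as error:  # broad on purpose: any failure triggers a retry
            last_error = error
            if attempt < max_retries:
                sleep(retry_delay)  # wait before the next attempt
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error
```

A transcription call for one chunk would be passed in as `func`, for example wrapped in a `lambda`. The `sleep` parameter exists only so the delay can be replaced in tests.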
If you encounter errors regarding ffmpeg not being found, you can install ffmpeg using Homebrew:
brew install ffmpeg
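Before recording anything, it can help to confirm that ffmpeg is actually on the PATH, since pydub needs it for MP3 conversion. The following sketch uses only the standard library; `find_missing_tools` is a hypothetical helper, not part of the script.

```python
import shutil


def find_missing_tools(tools=("ffmpeg",), which=shutil.which):
    """Return the names of required command-line tools that are not on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("ffmpeg found.")
```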
If you encounter issues when installing PyAudio on an M1 Mac, you can follow the steps below to troubleshoot the issue:
1. Install portaudio using Homebrew:
   brew install portaudio
2. Link portaudio using Homebrew:
   brew link portaudio
3. Copy the path where portaudio was installed. You can get this by running:
   brew --prefix portaudio
4. Create a .pydistutils.cfg file in your home directory:
   sudo nano $HOME/.pydistutils.cfg
5. Paste the following content into the file:
   [build_ext]
   include_dirs=<PATH FROM STEP 3>/include/
   library_dirs=<PATH FROM STEP 3>/lib/
   Replace <PATH FROM STEP 3> with the path you copied in step 3.
6. Install pyaudio using pip:
   pip install pyaudio
These steps should help you install PyAudio on an M1 Mac. If you encounter any issues, please check the PyAudio documentation or seek help from the PyAudio community.
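One quick way to confirm that the installation worked is to check whether the pyaudio module can be found by the interpreter you plan to run the script with. This is a minimal sketch; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("pyaudio", "pydub", "openai"):
        status = "ok" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```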
This project is licensed under the AGPL 3.0 license. See the LICENSE file for details.