This tool automates the transcription of audio files. It takes an MP3 file, splits it into 10-minute segments, transcribes each segment using OpenAI's Whisper API, and compiles the transcriptions into a single text file.
- Splits audio files into 10-minute segments.
- Transcribes audio segments using OpenAI's Whisper API.
- Compiles transcriptions into a single text file.
- Avoids overwriting existing output without user confirmation.
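This split/transcribe/compile flow is what the script implements. Below is a minimal sketch of it, assuming `pydub` for splitting and the pre-1.0 `openai` package for transcription; the segment naming and class structure in the actual code may differ.

```python
from pathlib import Path

import openai
from pydub import AudioSegment

SEGMENT_MS = 10 * 60 * 1000  # 10 minutes, in milliseconds


def transcribe_file(mp3_path: str) -> str:
    """Split the MP3 into 10-minute segments, transcribe each, and join the text."""
    audio = AudioSegment.from_mp3(mp3_path)
    texts = []
    for index, start in enumerate(range(0, len(audio), SEGMENT_MS)):
        segment_path = Path(f"segment_{index}.mp3")
        audio[start:start + SEGMENT_MS].export(segment_path, format="mp3")
        with open(segment_path, "rb") as segment_file:
            response = openai.Audio.transcribe("whisper-1", segment_file)
        texts.append(response["text"])
    return "\n".join(texts)
```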
- Python 3.6 or higher.
- pip for installing dependencies.
- `venv` or Virtualenv for creating an isolated Python environment (optional, but recommended).
- An OpenAI API key with access to the Whisper API.
- Clone the repository to your local machine.

  ```sh
  git clone https://github.com/your-username/your-repo-name.git
  cd your-repo-name
  ```
- (Optional) Create a virtual environment to isolate the project dependencies. Replace `env_name` with your desired environment name.

  ```sh
  python3 -m venv env_name
  ```
- Activate the virtual environment.

  - On macOS and Linux:

    ```sh
    source env_name/bin/activate
    ```

  - On Windows:

    ```sh
    .\env_name\Scripts\activate
    ```
- Install the required Python packages within the activated virtual environment.

  ```sh
  pip install -r requirements.txt
  ```
- Set up your OpenAI API key by creating a `.env` file in the root directory and adding your key to it.

  ```sh
  echo OPENAI_API_KEY='your-api-key' > .env
  ```
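The script is expected to read this key at startup. A minimal sketch of how that can work, assuming the `python-dotenv` package is among the dependencies in `requirements.txt` (the actual loading code in this repo may differ):

```python
import os

import openai
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads OPENAI_API_KEY from the .env file in the working directory
openai.api_key = os.getenv("OPENAI_API_KEY")
```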
Activate your virtual environment if it is not already activated and run the script with the path to your audio file.

```sh
source env_name/bin/activate   # On macOS and Linux
.\env_name\Scripts\activate    # On Windows
python main.py path_to_your_audio_file.mp3
```
The script will process the audio file, transcribe it, and save the transcription in a dedicated directory named after the audio file.
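The exact output layout is defined by `main.py`, but the directory naming and the "no overwrite without confirmation" behaviour from the feature list roughly amount to the sketch below; the file name `transcription.txt` is illustrative, not necessarily the one the script uses.

```python
from pathlib import Path


def output_file_for(audio_path: str) -> Path:
    """Return <audio-name>/transcription.txt, asking before overwriting existing output."""
    out_dir = Path(Path(audio_path).stem)      # e.g. "interview" for interview.mp3
    out_dir.mkdir(exist_ok=True)
    out_file = out_dir / "transcription.txt"   # illustrative file name
    if out_file.exists():
        answer = input(f"{out_file} already exists. Overwrite? [y/N] ")
        if answer.strip().lower() != "y":
            raise SystemExit("Aborted; existing transcription left untouched.")
    return out_file
```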
- To adjust the length of audio segments, modify the `ten_minutes` variable in the `AudioProcessor` class.
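For orientation, the snippet below shows where such a value typically lives, assuming the splitting is done with `pydub`, which slices audio by milliseconds; the class is a stripped-down stand-in, not the real implementation.

```python
class AudioProcessor:
    # Segment length in milliseconds; 10 minutes by default.
    # Lower it (e.g. 5 * 60 * 1000) for shorter segments.
    ten_minutes = 10 * 60 * 1000
```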
We welcome contributions! Please fork the repository and submit a pull request with your suggested changes.
This project is released under the MIT License.