Audio Transcription Script

This Python script automates the process of transcribing MP3 audio files to text using OpenAI's Whisper model. It's designed to efficiently handle large audio files by splitting them into smaller chunks and processing them in parallel.

Features

Accepts MP3 files as input
Splits audio into manageable chunks for efficient processing
Utilizes parallel processing for faster transcription
Outputs transcription to a text file
Handles errors gracefully, continuing even if individual chunks fail

Prerequisites

Before you begin, ensure you have met the following requirements:

Python 3.7 or higher
FFmpeg installed on your system

Installation

Clone this repository:

git clone https://github.com/your-username/audio-transcription-script.git
cd audio-transcription-script

Install the required Python packages:
```
pip install whisper pydub
```

Usage

Run the script from the command line, providing the path to your MP3 file:

python transcribe_audio.py path/to/your/audio.mp3

By default, this will create a file named transcription_output.txt in the same directory.

To specify a custom output file:

python transcribe_audio.py path/to/your/audio.mp3 -o path/to/output.txt

Options

-o, --output: Specify the path for the output text file (default: transcription_output.txt)

Performance

This script is optimized for multi-core processors and should perform well on systems like M1 Macs. The audio is split into 1-minute chunks by default, which are processed in parallel.

Limitations

Currently only supports MP3 input files
Transcription accuracy depends on the Whisper model used (default is "base")

Contributing

Contributions to improve the script are welcome. Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.

Acknowledgments

This script uses OpenAI's Whisper model for transcription
Audio processing is handled by the pydub library

nnnnicholas/transcribe-audiio