YT2Text: YouTube Transcriber

This tool allows you to transcribe YouTube videos. It uses OpenAI's Whisper model for transcription, with AssemblyAI as a fallback for large files.

Features

Download audio from YouTube videos with real-time progress tracking
YouTube URL validation to ensure correct input format
Video information display (title, duration, uploader) before processing
Dual transcription service support:
- OpenAI Whisper for files ≤ 25MB (99% accuracy, 98+ languages)
- AssemblyAI as automatic fallback for files > 25MB (no size limit)
Automatic service selection based on file size
Duration warnings for long videos to avoid unexpected costs
Save transcriptions to a text file with character count information
Secure storage of API keys with proper file permissions
Comprehensive error handling and user feedback
Automatic cleanup of temporary files
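
As an illustration of the URL validation feature, here is a minimal sketch of how such a check could be written with the standard library. The function name `is_youtube_url` and the `YOUTUBE_HOSTS` set are assumptions made for this example; the script's actual validation logic may differ.

```python
from urllib.parse import urlparse, parse_qs

# Hostnames accepted in this sketch (an assumption, not the script's actual list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True if url looks like a link to a single YouTube video."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname or ""
    if host not in YOUTUBE_HOSTS:
        return False
    if host == "youtu.be":
        # Short links carry the video ID in the path: https://youtu.be/<id>
        return parsed.path.strip("/") != ""
    if parsed.path == "/watch":
        # Standard links carry the video ID in the "v" query parameter.
        return bool(parse_qs(parsed.query).get("v"))
    # Shorts and embed links carry the ID as the second path segment.
    parts = parsed.path.split("/")
    return len(parts) >= 3 and parts[1] in ("shorts", "embed") and parts[2] != ""
```

A check like this only confirms the shape of the link; whether the video exists is still determined when the download starts.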

Prerequisites

Python 3.7 or higher
pip (Python package installer)
FFmpeg installed and available in your PATH (download from https://ffmpeg.org/download.html)
API Keys:
- OpenAI API key (required) - get it from https://platform.openai.com/api-keys
- AssemblyAI API key (optional, for large files) - get it from https://www.assemblyai.com/
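
To confirm the prerequisites before running the script, a small check along these lines may help. It is a sketch only: the helper name `missing_prerequisites` is hypothetical and not part of YT2Text. It relies on `shutil.which`, which returns `None` when an executable is not on your PATH.

```python
import shutil
import sys


def missing_prerequisites(min_python=(3, 7), tools=("ffmpeg",)):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found in PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments checks for Python 3.7+ and FFmpeg; printing the returned list shows what still needs to be installed.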

See .env.example for environment variable setup. The actual environment file will be created automatically when you run the script.

Installation (Windows)

Clone this repository:

git clone https://github.com/PierrunoYT/YT2Text.git
cd YT2Text

Create and activate a virtual environment:
```
python -m venv venv
venv\Scripts\activate
```
Install required packages:
```
pip install -r requirements.txt
```

Installation (macOS/Linux)

Clone this repository:

git clone https://github.com/PierrunoYT/YT2Text.git
cd YT2Text

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install required packages:

pip install -r requirements.txt

Usage

Run the script:

python youtube_transcriber.py    # Windows
python3 youtube_transcriber.py   # macOS/Linux

Follow the prompts to:
- Enter your OpenAI API key (required, only on first run)
- Enter your AssemblyAI API key (optional, for large files, only on first run)
- Provide a YouTube video URL
- Specify the output file name for the transcription
The script will:
- Download the audio
- Automatically choose the appropriate transcription service based on file size
- Transcribe the audio and save it to a file
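
The last step, together with the temporary-file cleanup, can be sketched as follows. This is an illustrative outline under stated assumptions, not the script's actual code: `save_transcript` and `run_with_temp_audio` are hypothetical names, and the download and transcription steps are represented by a caller-supplied function.

```python
import tempfile
from pathlib import Path


def save_transcript(text: str, output_path: Path) -> int:
    """Write text to output_path and return its character count."""
    output_path.write_text(text, encoding="utf-8")
    return len(text)


def run_with_temp_audio(process):
    """Call process(audio_path) inside a temp directory that is removed afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        # The directory and everything in it is deleted when this block exits,
        # even if process() raises an exception.
        return process(Path(tmp) / "audio.mp3")
```

Using `tempfile.TemporaryDirectory` as a context manager means the downloaded audio is cleaned up even when transcription fails partway through.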

Files ≤ 25MB: Uses OpenAI Whisper (high accuracy, supports 98+ languages)
Files > 25MB: Automatically switches to AssemblyAI (no file size limit)
If no AssemblyAI API key is provided, the script warns you when a file exceeds 25MB.
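
The selection rule above can be expressed in a few lines. This is a sketch under assumptions: the function name `choose_service` and its return values are hypothetical, and 25MB is taken to mean 25 × 1024 × 1024 bytes.

```python
from typing import Optional

# 25MB threshold; assuming binary megabytes for this sketch.
WHISPER_LIMIT_BYTES = 25 * 1024 * 1024


def choose_service(size_bytes: int, has_assemblyai_key: bool) -> Optional[str]:
    """Pick a transcription service for an audio file of the given size."""
    if size_bytes <= WHISPER_LIMIT_BYTES:
        return "whisper"
    if has_assemblyai_key:
        return "assemblyai"
    # Too large for Whisper and no fallback key: the caller should warn the user.
    return None
```

In practice the size would come from `os.path.getsize(audio_path)`, and a `None` result corresponds to the large-file warning described above.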

API keys are stored in your home directory with restricted file permissions.
Temporary audio files are automatically cleaned up after processing.
On Linux and macOS, you might need to use python3 instead of python depending on your system configuration.
For long videos (over 1 hour), the script will provide a warning about potential costs and processing time.
AssemblyAI is automatically used for files larger than 25MB, eliminating the previous file size limitation.
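
For readers curious how a key file can be kept private, here is one possible approach using owner-only permissions. It is a sketch, not the script's actual storage code: `save_key` and `load_key` are hypothetical helpers, and the file location is left to the caller. On Windows, POSIX permission bits are only partially honored.

```python
import os
import stat
from pathlib import Path


def save_key(path: Path, key: str) -> None:
    """Write key to path, readable and writable by the owner only."""
    # Create the file with mode 0o600 so it is not world-readable, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(key)
    # Re-apply the mode in case the file already existed with looser permissions.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def load_key(path: Path):
    """Return the stored key, or None if no key file exists or it is empty."""
    try:
        return path.read_text(encoding="utf-8").strip() or None
    except FileNotFoundError:
        return None
```

A caller might pass something like `Path.home() / ".example_key"` (a made-up file name) and prompt for the key only when `load_key` returns `None`, which matches the first-run behavior described earlier.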

This project is licensed under the MIT License - see the LICENSE file for details.