WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI's Whisper for free, making transcription more accessible and convenient.
- Record audio with a simple click.
- Automatically transcribe audio using Whisper (free).
- Option to save transcriptions directly to the clipboard.
- Python 3.8 or higher
- CUDA is highly recommended for better performance but not necessary. WhisperClip can also run on a CPU.
-
Clone the repository:
git clone https://github.com/gustavostz/whisper-clip.git cd whisper-clip
-
Install PyTorch if you don't have it already. Refer to PyTorch's website for installation instructions.
-
Install the required dependencies:
pip install -r requirements.txt
Based on your GPU's VRAM, choose the appropriate Whisper model for optimal performance. Below is a table of available models with their required VRAM and relative speed:
Size | Required VRAM | Relative speed |
---|---|---|
tiny | ~1 GB | ~32x |
base | ~1 GB | ~16x |
small | ~2 GB | ~6x |
medium | ~5 GB | ~2x |
large | ~10 GB | 1x |
For English-only applications, .en
models (e.g., tiny.en
, base.en
) tend to perform better.
To change the model, modify the model_name
variable in config.json
to the desired model name.
Run the application:
python main.py
- Click the microphone button to start and stop recording.
- If "Save to Clipboard" is checked, the transcription will be copied to your clipboard automatically.
- The default shortcut for toggling recording is
Alt+Shift+R
. You can modify this in theconfig.json
file. - You can also change the Whisper model used for transcription in the
config.json
file.
If there's interest in a more user-friendly, executable version of WhisperClip, I'd be happy to consider creating one. Your feedback and suggestions are welcome! Just let me know through the GitHub issues.
This project uses OpenAI's Whisper for audio transcription.