A simple GUI made with gradio to use Whisper.
- Anaconda or Miniconda installed and conda added to PATH.
- git installed and added to PATH. See instructions.
- ffmpeg installed and added to PATH. See instructions for Windows, Linux or macOS.
Optionally, to use an Nvidia GPU on Windows:
- CUDA version ≥12.0. Install from Nvidia's official site.
Note: For AMD GPUs (ROCm), GPU support for Whisper is only available in Linux.
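Before running the setup script, it can help to confirm that the required tools are actually reachable. The following is a minimal sketch, not part of this repository: the helper name `missing_tools` is hypothetical, and it only checks that `conda`, `git` and `ffmpeg` resolve on PATH, not their versions.

```python
import shutil

# Command-line tools the requirements list expects on PATH.
REQUIRED_TOOLS = ("conda", "git", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("conda, git and ffmpeg are all on PATH.")
```

If a tool is reported as missing, revisit the installation instructions for it and open a new terminal so the updated PATH is picked up.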
- On Windows, run the whisper-gui.bat file. On Linux / macOS, run the whisper-gui.sh file. Follow the instructions and let the script install the necessary dependencies. When it finishes, it will open the GUI in a new browser tab.
Otherwise, the manual steps are:
- Create a conda environment with Python 3.10
conda create --name whisperx python=3.10
conda activate whisperx
- Install PyTorch 2.0
For macOS:
conda install pytorch::pytorch==2.0.0 torchaudio==2.0.0 -c pytorch
For Windows or Linux, if you have an Nvidia GPU:
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
For Linux, if you have an AMD GPU:
pip install torch==2.0.0 torchaudio==2.0.0 --index-url https://download.pytorch.org/whl/rocm6.0
If not, install for CPU:
conda install pytorch==2.0.0 torchaudio==2.0.0 cpuonly -c pytorch
- Install whisperx and dependencies
pip install git+https://github.com/m-bain/whisperx.git
Original instructions in: https://github.com/m-bain/whisperX
- Install additional libraries
pip install gradio
- Clone this repository
git clone https://github.com/Pikurrot/whisper-gui
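After the manual steps, a quick check from inside the activated whisperx environment can confirm that the packages were installed where Python can find them. This is a hedged sketch rather than anything shipped with the repository: the helper names are hypothetical, and it assumes the packages expose the top-level modules `torch`, `torchaudio`, `whisperx` and `gradio`.

```python
import importlib.util

# Top-level modules the manual steps are assumed to install.
EXPECTED_MODULES = ("torch", "torchaudio", "whisperx", "gradio")


def missing_modules(names=EXPECTED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def torch_device_summary():
    """Describe whether PyTorch sees a GPU (CUDA or ROCm builds both report via torch.cuda)."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch  # imported lazily so the check still runs without PyTorch

    if torch.cuda.is_available():
        return f"torch {torch.__version__}: GPU available"
    return f"torch {torch.__version__}: CPU only"


if __name__ == "__main__":
    print("Missing modules:", missing_modules() or "none")
    print(torch_device_summary())
```

A "CPU only" result on a machine with a supported GPU usually means the CPU build of PyTorch was installed, in which case the matching install command above can be re-run.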
To run the program afterwards, just run the same whisper-gui.bat or whisper-gui.sh (depending on your OS), which also automatically checks for updates to this repository.
Your transcriptions will be saved by default in the outputs folder of the repository.
Otherwise, to run manually:
conda activate whisperx
python main.py --autolaunch
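Since transcriptions are saved in the outputs folder by default, a small script can list the most recent ones. This is only an illustrative sketch: `newest_files` is a hypothetical helper, and it makes no assumption about the file names or formats the GUI writes, sorting purely by modification time.

```python
from pathlib import Path


def newest_files(folder="outputs", limit=5):
    """Return up to `limit` files under `folder`, most recently modified first."""
    root = Path(folder)
    if not root.is_dir():
        return []  # nothing has been saved yet, or a different folder is in use
    files = [path for path in root.rglob("*") if path.is_file()]
    files.sort(key=lambda path: path.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    # Run from the repository root so that "outputs" resolves correctly.
    for path in newest_files():
        print(path)
```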
To run this software in a Docker container, visit this Docker Hub project.
Thank you 3x3cut0r!
This project is primarily distributed under the terms of the MIT License. See the LICENSE file for details.
Third-Party Code
Portions of this project incorporate code from WhisperX, which is licensed under the BSD-4-Clause license. This code is used in accordance with its license, and the full text of the license can be found within the relevant source files.