A simple GUI to use Whisper.

Whisper-GUI

A simple GUI made with gradio to use Whisper.

Requirements

Optionally, to use Nvidia GPU on Windows:

Note: For AMD GPUs (ROCm), GPU support for Whisper is available only on Linux.

Set up

  • On Windows, run the whisper-gui.bat file. On Linux / macOS, run the whisper-gui.sh file. Follow the prompts and let the script install the necessary dependencies. When installation finishes, the script opens the GUI in a new browser tab.

Otherwise, manual steps are:

  • Create a conda environment with Python 3.10
    conda create --name whisperx python=3.10
    conda activate whisperx
  • Install PyTorch 2.0
    For macOS:
    conda install pytorch::pytorch==2.0.0 torchaudio==2.0.0 -c pytorch
    For Windows or Linux, if you have Nvidia GPU:
    conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
    For Linux, if you have AMD GPU:
    pip install torch==2.0.0 torchaudio==2.0.0 --index-url https://download.pytorch.org/whl/rocm6.0
    If not, install for CPU:
    conda install pytorch==2.0.0 torchaudio==2.0.0 cpuonly -c pytorch
  • Install whisperx and dependencies
    pip install git+https://github.com/m-bain/whisperx.git
    Original instructions in: https://github.com/m-bain/whisperX
  • Install additional libraries
    pip install gradio
  • Clone this repository
    git clone https://github.com/Pikurrot/whisper-gui
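
The choice of PyTorch install command in the manual steps depends on the operating system and GPU vendor. As a rough sketch of that decision, the snippet below maps a platform name and a GPU label to the matching command from the list above. The function name `pytorch_install_command` and the GPU labels `"nvidia"`, `"amd"`, and `"none"` are assumptions made for illustration; they are not part of this repository or of the setup scripts.

```python
import platform

# Commands copied verbatim from the manual steps above. The dictionary keys,
# the function, and its parameter names are illustrative only.
PYTORCH_INSTALL_COMMANDS = {
    "macos": "conda install pytorch::pytorch==2.0.0 torchaudio==2.0.0 -c pytorch",
    "nvidia": (
        "conda install pytorch==2.0.0 torchaudio==2.0.0 "
        "pytorch-cuda=11.8 -c pytorch -c nvidia"
    ),
    "amd": (
        "pip install torch==2.0.0 torchaudio==2.0.0 "
        "--index-url https://download.pytorch.org/whl/rocm6.0"
    ),
    "cpu": "conda install pytorch==2.0.0 torchaudio==2.0.0 cpuonly -c pytorch",
}


def pytorch_install_command(system: str, gpu: str = "none") -> str:
    """Return the install command for an OS name and a GPU vendor label.

    system: a value as returned by platform.system()
            ("Darwin", "Windows", or "Linux").
    gpu:    "nvidia", "amd", or "none" (hypothetical labels for this sketch).
    """
    gpu = gpu.lower()
    if system == "Darwin":
        return PYTORCH_INSTALL_COMMANDS["macos"]
    if system in ("Windows", "Linux") and gpu == "nvidia":
        return PYTORCH_INSTALL_COMMANDS["nvidia"]
    if system == "Linux" and gpu == "amd":
        # ROCm support is Linux-only, so AMD on Windows falls through to CPU.
        return PYTORCH_INSTALL_COMMANDS["amd"]
    return PYTORCH_INSTALL_COMMANDS["cpu"]


if __name__ == "__main__":
    # The GPU vendor is not detected here; pass it explicitly.
    print(pytorch_install_command(platform.system(), gpu="none"))
```

The sketch only prints a command; it does not run it, so the output can be reviewed before pasting it into the activated conda environment.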

Run the GUI

To launch the program afterwards, run the same whisper-gui.bat or whisper-gui.sh script (depending on your OS). It also checks for updates to this repository automatically.
Your transcriptions will be saved by default in the outputs folder of the repository.

Otherwise, to run manually:
conda activate whisperx
python main.py --autolaunch
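
Before launching manually, it can help to confirm that the activated environment actually contains the packages from the manual steps. The following is a minimal sketch of such a check, not a script shipped with this repository: the helper names `missing_modules` and `describe_device` are hypothetical, and the module list assumes the import names `torch`, `torchaudio`, `whisperx`, and `gradio`.

```python
import importlib.util

# Assumed import names for the packages installed in the manual steps.
REQUIRED_MODULES = ("torch", "torchaudio", "whisperx", "gradio")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def describe_device():
    """Report which device PyTorch would use, or note that torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check above works without torch

    # ROCm builds of PyTorch also report through torch.cuda.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages found.")
    print("Device:", describe_device())
```

Run it inside the whisperx conda environment. If it reports missing packages, repeat the corresponding install step before starting main.py.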

Docker container (CPU only for now)

To run this software in a docker container, visit this dockerhub project.
Thank you 3x3cut0r!

Licensing

This project is primarily distributed under the terms of the MIT License. See the LICENSE file for details.

Third-Party Code
Portions of this project incorporate code from WhisperX, which is licensed under the BSD-4-Clause license. This code is used in accordance with that license, and the full license text can be found within the relevant source files.