The codebase is expected to be compatible with Python 3.8-3.11 and recent PyTorch versions. It also depends on a few Python packages, most notably OpenAI's tiktoken for its fast tokenizer implementation. You can set up the environment with the following command:
```
pip install -r requirements.txt
```

It also requires the command-line tool ffmpeg to be installed on your system, which is available from most package managers:
```
# if you use a conda environment
conda install ffmpeg

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
```

You can install bmwhisper as follows:
```
python setup.py install
```

The following command transcribes speech in audio files on the CPU, using the `base` model:
```
bmwhisper demo.wav --model base
```

To use the TPU, first generate the ONNX model:
```
./gen_onnx.sh --model base

# if you want to use kvcache
./gen_onnx.sh --model base --use_kvcache
```

Then, convert the ONNX model to a bmodel:
```
./gen_bmodel.sh --model base

# if you want to use kvcache
./gen_bmodel.sh --model base --use_kvcache

# if you want to compare the data when transforming and deploying
./gen_bmodel.sh --model base --compare
```

Use the `--inference` flag to enable TPU inference mode:
```
bmwhisper demo.wav --model base --inference
```
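If you want to drive these invocations from Python (for batch jobs, for example), the commands above can be assembled as argument lists and passed to `subprocess.run`. The helper below is an illustrative sketch, not part of bmwhisper itself; it only encodes the `--model` and `--inference` flags shown in this README:

```python
import shlex

def build_transcribe_cmd(audio_path, model="base", tpu=False):
    """Assemble a bmwhisper command line (illustrative helper, not a bmwhisper API).

    tpu=True appends the --inference flag to enable TPU inference mode.
    """
    cmd = ["bmwhisper", audio_path, "--model", model]
    if tpu:
        cmd.append("--inference")
    return cmd

# CPU transcription, as in the first example above:
print(shlex.join(build_transcribe_cmd("demo.wav")))
# → bmwhisper demo.wav --model base

# TPU inference mode:
print(shlex.join(build_transcribe_cmd("demo.wav", tpu=True)))
# → bmwhisper demo.wav --model base --inference
```

The resulting list can be executed with `subprocess.run(cmd, check=True)` once bmwhisper is installed and, for TPU mode, the bmodel has been generated.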