Wyoming protocol server for the faster-whisper speech-to-text system.
Prerequisites: an NVIDIA GPU with the NVIDIA driver and CUDA installed.
Set up a Python virtual environment (only needed if you use Conda):
conda create -n fwf python=3.10
conda activate fwf
Clone the repository and install the dependencies:
git clone https://github.com/neowisard/wyoming-faster-whisper.git
cd wyoming-faster-whisper
pip install -r requirements.txt
Download the large-v3 model into the data directory (/ai/models/whisper). Cloning Hugging Face model repositories requires Git LFS:
git clone https://huggingface.co/Systran/faster-whisper-large-v3
mv faster-whisper-large-v3 large-v3
Or the INT8-quantized variant (smaller and faster than the full-precision large-v3):
git clone https://huggingface.co/neowisard/fwhisper-large-v3-int8
mv fwhisper-large-v3-int8 large-v3i
Run a server anyone can connect to, at full precision:
python3 -m wyoming_faster_whisper --uri 'tcp://0.0.0.0:10300' --data-dir /ai/models/whisper --model large-v3 --beam-size 1 --language ru --download-dir /ai/models/whisper --compute-type float16 --device cuda --initial-prompt "prompt"
Or with the INT8-quantized model:
python3 -m wyoming_faster_whisper --uri 'tcp://0.0.0.0:10300' --data-dir /ai/models/whisper --model large-v3i --beam-size 1 --language ru --download-dir /ai/models/whisper --compute-type int8_float32 --device cuda
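Once the server is running, it speaks the Wyoming protocol over TCP: each event is a single JSON header line (with `type`, `data`, and an optional `payload_length`), followed by that many bytes of binary payload. A rough client sketch is below; `encode_event` and `transcribe` are illustrative helpers written for this example, not part of this project, and the event names follow the Wyoming audio/ASR events (`transcribe`, `audio-start`, `audio-chunk`, `audio-stop`):

```python
import json
import socket
import wave

def encode_event(event_type, data=None, payload=b""):
    """Frame a Wyoming event: one JSON header line, then optional payload bytes."""
    header = {"type": event_type, "data": data or {}}
    if payload:
        header["payload_length"] = len(payload)
    return json.dumps(header).encode("utf-8") + b"\n" + payload

def transcribe(host, port, wav_path):
    """Stream a 16-bit PCM WAV file to the server and print the first reply event."""
    with wave.open(wav_path, "rb") as wav, socket.create_connection((host, port)) as sock:
        rate, width, channels = wav.getframerate(), wav.getsampwidth(), wav.getnchannels()
        fmt = {"rate": rate, "width": width, "channels": channels}
        sock.sendall(encode_event("transcribe", {"language": "ru"}))
        sock.sendall(encode_event("audio-start", fmt))
        chunk = wav.readframes(1024)
        while chunk:
            sock.sendall(encode_event("audio-chunk", fmt, payload=chunk))
            chunk = wav.readframes(1024)
        sock.sendall(encode_event("audio-stop"))
        # The transcript comes back as another JSON header line
        reply = sock.makefile("rb").readline()
        print(json.loads(reply))
```

Usage would look like `transcribe("127.0.0.1", 10300, "test.wav")` against a server started as above.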