Whisper API Server

Quick Start

Download whisper-burn/tiny_en model
```
curl -LO https://huggingface.co/second-state/whisper-burn/resolve/main/tiny_en.tar.gz
```
Then, unzip the tiny_en.tar.gz file to get the tiny_en.mpk, tiny_en.cfg, and tokenizer.json files.
```
tar -xvzf tiny_en.tar.gz
```

Download audio file

curl -LO https://huggingface.co/second-state/whisper-burn/resolve/main/audio16k.wav

Download whisper-api-server.wasm binary

curl -LO https://github.com/LlamaEdge/whisper-api-server/raw/main/whisper-api-server.wasm

Start whisper-api-server
```
wasmedge --dir .:. \
  --nn-preload default:Burn:CPU:tiny_en.mpk:tiny_en.cfg:tokenizer.json:en \
  whisper-api-server.wasm
```
[!NOTE] The wasmedge-burn plugin is required to run the whisper-api-server.wasm binary. See Build plugin to build the plugin from source. For Apple Silicon users, you can download the plugin here.

Send curl request to the transcriptions endpoint

curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio16k.wav"

If everything is set up correctly, you should see the transcriptions result:

{
    "text": " Hello, I am the whisper machine learning model. If you see this as text then I am working properly."
}

Build

Clone the repository

git clone https://github.com/LlamaEdge/whisper-api-server.git

Build the whisper-api-server.wasm binary
```
cd whisper-api-server

cargo build --release --target wasm32-wasi
```
If the build is successful, you should see the whisper-api-server.wasm binary in the target/wasm32-wasi/release directory.

CLI Options

$ wasmedge whisper-api-server.wasm -h

Whisper API Server

Usage: whisper-api-server.wasm [OPTIONS]

Options:
  -m, --model-name <MODEL_NAME>    Model name [default: default]
      --model-alias <MODEL_ALIAS>  Model alias [default: default]
      --socket-addr <SOCKET_ADDR>  Socket address of Whisper API server instance [default: 0.0.0.0:8080]
  -h, --help                       Print help
  -V, --version                    Print version

Run with Docker

./docker-build.sh

docker run \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  --env WASMEDGE_WASINN_PRELOAD=default:Burn:GPU:/tiny_en.mpk:/tiny_en.cfg:/tokenizer.json:en \
  -p 8080:8080 \
  burn-whisper-server:latest

docker pull secondstate/burn-whisper-server:latest

docker run \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  --env WASMEDGE_WASINN_PRELOAD=default:Burn:GPU:/tiny_en.mpk:/tiny_en.cfg:/tokenizer.json:en \
  -p 8080:8080 \
  secondstate/burn-whisper-server:latest

Then

curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio16k.wav"