Forked to add a few conveniences:
- use WebGPU for transcription by default
- use the `whisper-large-v3-turbo` model by default
- give the downloaded transcript the same name as the input file (instead of `transcript.txt`)
- automatically download a text file transcript when transcription is complete
- downloaded text files will have a timestamp for each transcribed chunk of audio data (instead of a massive single paragraph of text)
- verify that the models get cached
- add drag and drop for input file uploads
- add more performance metrics for transcription
- refactor UI components to make them easier (imho) to understand, à la Clean Code's recommendations for functions
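To illustrate the transcript-related items above, here is a minimal sketch of how per-chunk timestamps and the output file name could be produced. It is not the code from this fork: the `TranscriptChunk` shape (a `[start, end]` pair in seconds plus text) and the helper names `formatTime`, `formatTranscript`, and `transcriptFileName` are assumptions made for illustration.

```typescript
// Hypothetical chunk shape: [start, end] in seconds plus the transcribed text.
interface TranscriptChunk {
  timestamp: [number, number | null];
  text: string;
}

// Zero-pad a number to two digits.
function pad2(n: number): string {
  return (n < 10 ? "0" : "") + n;
}

// Format a number of seconds as HH:MM:SS (fractions are dropped).
function formatTime(seconds: number): string {
  const total = Math.floor(seconds);
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  return pad2(h) + ":" + pad2(m) + ":" + pad2(s);
}

// One line per chunk, each prefixed with its start time,
// instead of a single large paragraph.
function formatTranscript(chunks: TranscriptChunk[]): string {
  return chunks
    .map((c) => "[" + formatTime(c.timestamp[0]) + "] " + c.text.trim())
    .join("\n");
}

// Derive the transcript name from the input file name by swapping the
// extension for ".txt"; names without an extension just get ".txt" appended.
function transcriptFileName(inputName: string): string {
  const dot = inputName.lastIndexOf(".");
  const base = dot > 0 ? inputName.slice(0, dot) : inputName;
  return base + ".txt";
}
```

With these helpers, an input named `interview.mp3` would yield a download called `interview.txt`, with each chunk on its own line prefixed by its start time.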
- Clone the repo and install dependencies:

      git clone https://github.com/shola/whisper-web.git
      cd whisper-web
      pnpm install # optional, `npm` will work just fine
- Run the development server:

      pnpm run dev
  Firefox users need to change the `dom.workers.modules.enabled` setting in `about:config` to `true` to enable Web Workers. Check out this issue for more details.
- Open the link (e.g., http://localhost:5173/) in your browser.
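
Because this fork transcribes with WebGPU by default, it can be worth checking that the browser you open the link in actually exposes WebGPU. The sketch below shows one way to do that using the standard `navigator.gpu.requestAdapter()` call. The `pickDevice` helper, the `GpuLike` type, and the `"wasm"` fallback label are hypothetical names for illustration, not part of this repo.

```typescript
// Minimal structural type so the sketch compiles without WebGPU type definitions.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Resolves to "webgpu" when an adapter can be obtained,
// otherwise to the (assumed) fallback label "wasm".
async function pickDevice(nav: { gpu?: GpuLike }): Promise<"webgpu" | "wasm"> {
  if (!nav.gpu) {
    return "wasm"; // browser does not expose WebGPU at all
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // a null adapter means no usable GPU
  } catch {
    return "wasm"; // treat any failure as "not available"
  }
}
```

In a browser, the function would be called with the page's `navigator` object; passing a plain object makes it easy to try outside a browser as well.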