Node.js bindings for OpenAI's Whisper.
- Output transcripts to JSON (in addition to .txt, .srt, and .vtt)
- Runs on the CPU (not the GPU)
- Timestamps accurate to a single word
Note: your project must use TypeScript.
Add the dependency to your project:

```bash
npm i whisper-node
```

Download the whisper model of your choice:

```bash
npx whisper-node download-model base.en
```
Usage:

```typescript
import whisper from 'whisper-node';

const params = {
  filePath: "example/sample.wav", // required
  model: "medium",                // default
  output: "JSON",                 // default
};

const transcript = await whisper(params);
```
The call resolves to an array of transcript segments:

```javascript
[
  {
    "tsB": "00:00:14.310",         // timestamp, segment begin
    "tsE": "00:00:20.480",         // timestamp, segment end
    "speech": "hey how's it going" // transcribed text
  }
]
```
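As a rough illustration of consuming this output, here is a minimal sketch that formats the segments as SRT-style cues. The `TranscriptSegment` interface and `toSrt` helper are assumptions modeled on the example above, not part of the whisper-node API.

```typescript
// A minimal sketch, assuming the segment shape shown in the example output above.
// `TranscriptSegment` and `toSrt` are illustrative names, not part of whisper-node.
interface TranscriptSegment {
  tsB: string;    // segment start, e.g. "00:00:14.310"
  tsE: string;    // segment end,   e.g. "00:00:20.480"
  speech: string; // transcribed text
}

function toSrt(segments: TranscriptSegment[]): string {
  return segments
    .map((seg, i) => {
      // SRT separates seconds and milliseconds with a comma, e.g. 00:00:14,310
      const start = seg.tsB.replace('.', ',');
      const end = seg.tsE.replace('.', ',');
      return `${i + 1}\n${start} --> ${end}\n${seg.speech}\n`;
    })
    .join('\n');
}
```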
- [ ] Support for non-TypeScript projects
- [ ] Deprecate use of the path package for browser and React Native compatibility
- [ ] fluent-ffmpeg to support mp3 and video ripping (see the sketch below)
- [ ] Pyannote diarization for speaker names
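Until mp3 and video ripping are supported natively, one possible workaround is to convert the source media to a 16 kHz mono WAV first. The sketch below uses fluent-ffmpeg (which needs an ffmpeg binary available on the system); the `toWav` helper and the file paths are illustrative assumptions, not part of this package.

```typescript
// A hedged sketch: pre-convert an mp3 to the 16 kHz mono WAV that whisper.cpp typically expects.
// `toWav` and the file paths are made up for this example.
import ffmpeg from 'fluent-ffmpeg';
import whisper from 'whisper-node';

function toWav(input: string, output: string): Promise<string> {
  return new Promise((resolve, reject) => {
    ffmpeg(input)
      .audioFrequency(16000) // 16 kHz sample rate
      .audioChannels(1)      // mono
      .toFormat('wav')
      .on('end', () => resolve(output))
      .on('error', reject)
      .save(output);
  });
}

const wavPath = await toWav('example/sample.mp3', 'example/sample.wav');
const transcript = await whisper({ filePath: wavPath, model: 'medium', output: 'JSON' });
```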