/whisper-batch

An application to transcribe videos with OpenAI Whisper in Docker


Whisper Batch - Docker

  1. Build the Docker image or pull it from the registry
docker build . -t whisper-batch
### OR
docker pull ghcr.io/davidmasp/whisper-batch:latest

This should take no more than a couple of minutes, and you only need to do it once.

  1. Create local structure
mkdir -p videos
mkdir -p transcripts
## put your video and audio files in the videos folder

You can rename these folders to whatever you like, as long as you use the same names in the run command. Only put video/audio files in the videos folder.
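
Because the videos folder should hold only video/audio files, a quick check before running the container can catch stray files. The following is a minimal sketch, not part of whisper-batch: `list_non_media` is a hypothetical helper, and the list of extensions is an assumption you should adjust to your own files.

```shell
# Hypothetical helper: print files in a folder whose extension is not in
# an assumed list of audio/video extensions (adjust the list as needed).
list_non_media() {
  dir="$1"
  for f in "$dir"/*; do
    [ -f "$f" ] || continue            # skip subfolders and an unmatched glob
    name=$(basename "$f")
    # lower-case the extension so that .MP4 and .mp4 are treated alike
    ext=$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
      mp4|mkv|mov|avi|webm|mp3|wav|m4a|flac|ogg) ;;  # assumed media extensions
      *) printf '%s\n' "$name" ;;
    esac
  done
}
```

Running `list_non_media ./videos` prints one file name per line and prints nothing when every file has one of the listed extensions.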

  1. Run the docker container
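pathtovideos=$(realpath ./videos)
pathtotranscripts=$(realpath ./transcripts)
## if you pulled the image, use ghcr.io/davidmasp/whisper-batch:latest instead of whisper-batch
docker run -v "${pathtovideos}:/videos" -v "${pathtotranscripts}:/transcripts" -t whisper-batch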
  1. Check the transcripts folder for the text files
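
To confirm that every input produced a transcript, the two folders can be compared. This is a rough sketch under a stated assumption: it supposes that each transcript's file name is the input's base name followed by a dot and some suffix (for example `talk.txt` for `talk.mp4`), which this README does not confirm. `list_missing_transcripts` is a hypothetical helper name.

```shell
# Hypothetical helper: list inputs that have no matching transcript.
# Assumes a transcript is named "<input base name>.<something>".
list_missing_transcripts() {
  in_dir="$1"
  out_dir="$2"
  for f in "$in_dir"/*; do
    [ -f "$f" ] || continue            # skip subfolders and an unmatched glob
    name=$(basename "$f")
    stem="${name%.*}"                  # file name without its last extension
    found=no
    for t in "$out_dir"/"$stem".*; do
      [ -f "$t" ] && found=yes && break
    done
    [ "$found" = yes ] || printf '%s\n' "$name"
  done
}
```

For example, `list_missing_transcripts ./videos ./transcripts` prints the inputs that still lack a transcript, one per line. If the actual naming scheme differs, adjust the inner glob accordingly.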

Notes

See here for the whisper.cpp implementation.

See here for the list of available models.

See here for how to obtain the right audio format.
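
As a pointer on the audio format: whisper.cpp works with 16-bit WAV audio sampled at 16 kHz, and its documentation converts files with ffmpeg. The sketch below wraps that conversion in a hypothetical helper, `to_wav16k`; it assumes ffmpeg is installed on the host. This README does not say whether the container already performs this conversion, so treat the helper as optional.

```shell
# Hypothetical helper: convert one file to 16 kHz, mono, 16-bit PCM WAV.
# Assumes ffmpeg is available on the host.
to_wav16k() {
  # fail early, before calling ffmpeg, if the input does not exist
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  ffmpeg -i "$1" -ar 16000 -ac 1 -c:a pcm_s16le "$2"
}
```

For example, `to_wav16k talk.mp3 talk.wav` writes the converted audio to `talk.wav` and leaves the original untouched.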