The audio was transcribed using Whisper. Speech segments were then embedded and clustered so that each segment is labeled with its corresponding speaker (speaker diarization).
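To illustrate the labeling step, here is a minimal sketch of grouping segment embeddings by speaker. It is not taken from audio_transcribe.py: the function names, the 0.8 threshold, and the toy 3-dimensional vectors are all assumptions made for illustration, and a simple greedy cosine-similarity grouping stands in for whatever embedding model and clustering algorithm the script actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_speakers(embeddings, threshold=0.8):
    """Greedily assign each segment embedding to a speaker.

    Hypothetical helper: a segment joins the most similar existing
    speaker if the similarity reaches `threshold` (an assumed value),
    otherwise it starts a new speaker.
    """
    centroids = []  # running mean embedding per speaker
    counts = []     # number of segments merged into each centroid
    labels = []
    for emb in embeddings:
        best_idx, best_sim = -1, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine_similarity(emb, centroid)
            if sim >= best_sim:
                best_idx, best_sim = idx, sim
        if best_idx == -1:
            # No existing speaker is similar enough: open a new one.
            centroids.append(list(emb))
            counts.append(1)
            best_idx = len(centroids) - 1
        else:
            # Fold the segment into the matched speaker's running mean.
            n = counts[best_idx]
            centroids[best_idx] = [
                (c * n + x) / (n + 1)
                for c, x in zip(centroids[best_idx], emb)
            ]
            counts[best_idx] = n + 1
        labels.append(f"SPEAKER_{best_idx + 1}")
    return labels


# Toy embeddings standing in for real speech-segment embeddings.
segments = [
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.9, 0.2, 0.0],
    [0.1, 0.9, 0.0],
]
labels = label_speakers(segments)
print(labels)
```

In this toy input the first and third segments point in nearly the same direction, as do the second and fourth, so the sketch groups them into two speakers. Real embeddings have far more dimensions, and the threshold would need tuning on actual recordings.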
- Install requirements
conda env create -f path/to/environment.yml
- Run transcription. Sample audio inputs can be found in /audio, and the script writes its output transcriptions under /transcriptions.
python audio_transcribe.py