Slow speech recognition
Closed this issue · 3 comments
Acquil commented
Transcribing a 1-minute chunk of the audio file takes around 30 seconds, but a single 7-minute chunk takes only 70 seconds, so larger chunks of audio are preferred.
Additional stats to report:
2 min 23 sec elapsed for a 28-minute video/audio file (split into 4-minute segments) with 8 subprocesses.
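To make the numbers above easier to compare, here is a rough sketch that converts each timing into a real-time factor (processing time divided by audio duration, lower is faster). The function name `real_time_factor` and the `measurements` table are only illustrative and are not part of speech.py; the timings are the ones reported in this thread.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; lower means faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# (processing seconds, audio seconds) as reported in the comments above.
measurements = {
    "1 min chunk": (30, 60),
    "7 min chunk": (70, 7 * 60),
    "28 min file, 4 min segments, 8 subprocesses": (2 * 60 + 23, 28 * 60),
}

for label, (processing, audio) in measurements.items():
    print(f"{label}: RTF = {real_time_factor(processing, audio):.3f}")
```

This prints 0.500, 0.167 and 0.085 respectively. The last figure is wall-clock time across 8 subprocesses, so it is not directly comparable to the two single-chunk timings.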
Acquil commented
Additional Tasks:
- Benchmark with different options against different audio files.
- Check if words get cut out in the middle while dividing into chunks.
  - Might require changes in audiosplitter.py
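One possible way to avoid cutting words, sketched here as an assumption and not as what audiosplitter.py currently does: instead of splitting at a fixed offset, look for the quietest short window near the intended split point and cut there. The names `rms` and `nearest_quiet_split` and all parameters are hypothetical, and the samples are assumed to be already decoded into a plain sequence of numbers.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy of a run of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def nearest_quiet_split(
    samples: Sequence[float], target: int, search: int, window: int
) -> int:
    """Return a split index near `target` that falls in the quietest window.

    Scans windows of `window` samples within `target - search` to
    `target + search` and returns the midpoint of the lowest-energy one.
    Falls back to `target` if no window can be examined.
    """
    lo = max(0, target - search)
    hi = min(len(samples), target + search)
    best_split, best_energy = target, float("inf")
    for start in range(lo, hi, window):
        energy = rms(samples[start:start + window])
        if energy < best_energy:
            best_split = min(start + window // 2, len(samples))
            best_energy = energy
    return best_split


if __name__ == "__main__":
    # Synthetic signal: loud, a short silent gap, then loud again.
    signal = [1.0] * 100 + [0.0] * 20 + [1.0] * 100
    print(nearest_quiet_split(signal, target=90, search=40, window=10))
```

With real audio, `window` would typically cover a few tens of milliseconds of samples, and a low energy window is only a heuristic for a pause between words.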
To test, please do the following:
- Switch to speech_text branch
- Go to deep-read/workbench/speech/
- Run setup.sh
chmod +x setup.sh
./setup.sh
- Run speech.py, passing the path to the audio file as a command-line argument.
chmod +x speech.py
./speech.py input.wav
You can convert a video to a .wav file with ffmpeg:
ffmpeg -i input.mp4 -vn deep-read/workbench/speech/output.wav
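If you would rather do the conversion from Python before calling speech.py, a minimal sketch could look like the following. It only wraps the same ffmpeg invocation shown above (`-i` for the input, `-vn` to drop the video stream); the helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical and not part of the repo, and ffmpeg is assumed to be on the PATH.

```python
import subprocess
from typing import List


def build_ffmpeg_command(video_path: str, wav_path: str) -> List[str]:
    """Build the ffmpeg argument list that extracts audio into a .wav file."""
    return ["ffmpeg", "-i", video_path, "-vn", wav_path]


def convert_to_wav(video_path: str, wav_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(video_path, wav_path), check=True)


if __name__ == "__main__":
    convert_to_wav("input.mp4", "deep-read/workbench/speech/output.wav")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.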
harish-ganesh commented
We will evaluate speech recognition with the Indian (EN) model.