This repository is related to automatic speech recognition (ASR).
The repository includes the following .ipynb files:
This notebook outlines the primary goals and objectives of the analysis. It includes instructions on how to download a video, extract audio, convert a text transcript to an SRT transcript, and describes the main tools and libraries used for transcription.
Open source project: whisper.cpp (based on OpenAI Whisper)
This notebook describes the results of using whisper.cpp, which is based on OpenAI Whisper.
Open source project: SeamlessM4T
This notebook describes the results of using SeamlessM4T.
Open source project: faster-whisper (based on OpenAI Whisper)
This notebook describes the results of using faster-whisper, which is based on OpenAI Whisper.
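
The first notebook mentions converting a text transcript to an SRT transcript. A minimal sketch of that step is shown here, assuming the transcript is already available as a list of (start_seconds, end_seconds, text) tuples. That input shape and the function names `format_timestamp` and `segments_to_srt` are hypothetical and are not taken from the notebooks or the utils folder.

```python
from typing import Iterable, Tuple

# Hypothetical input shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Example cues; the text is illustrative only.
    print(segments_to_srt([(0.0, 2.5, "Hello"), (2.5, 4.0, "world")]))
```

SRT uses a comma before the milliseconds and numbers cues from 1, which is why the timestamp is formatted by hand instead of with `str(timedelta)`.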
Additionally, there are a few folders:
- The data folder contains the transcripts. MP3, WAV, and MP4 files are excluded due to their significant size, but they can be extracted as described in the .ipynb files.
- The utils folder contains several Python files that are excluded from the .ipynb files to avoid overloading them with code. Links to these files are included in the .ipynb files.
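
Because the audio files are not stored in the data folder, they have to be regenerated from the video. One possible way to do this is sketched below, assuming the ffmpeg command-line tool is installed and on the PATH. The choice of ffmpeg, the function names, the example file names, and the 16 kHz mono 16-bit WAV target are assumptions made for illustration; the notebooks may use a different tool or settings.

```python
import subprocess
from typing import List


def build_ffmpeg_command(video_path: str, wav_path: str,
                         sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg command that extracts mono 16-bit PCM audio."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # one audio channel (mono)
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM (WAV)
        wav_path,
    ]


def extract_audio(video_path: str, wav_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(video_path, wav_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; prints the command without running ffmpeg.
    print(" ".join(build_ffmpeg_command("video.mp4", "audio.wav")))
```

Building the command as a list and passing it to `subprocess.run` without a shell avoids quoting problems with file names that contain spaces.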