Whisper Real Time Document
Project whisper-real-time.
Project description
What can whisper-real-time do for you?
Just as the name suggests, it is a real time offline transcriber with GUI.
Features
Why choose whisper-real-time?
1. Real time:
customise the delay seconds yourself.
2. Three modes of transcription:
Record mode, Real Time mode, Live mode.
3. Offline totally:
no online API, no privacy issues, no time limits.
Usage
1. The first tab
You could record the audio and transcribe it in the first tab.
Play: play the audio file selected (or double-click the item in the table).
Delete: delete the audio file selected.
Record: start recording.
Submit: stop recording and transcribe the audio record.
2. The second tab
The second tab is the real time zone, it will transcribe your voice continuously.
Begin: start recording and transcribe automatically.
End: stop recording and transcribe the whole audio record.
3. The third tab
In the third tab, you could control the pace of transcription yourself.
Live: start recording or transcribe the previous part and keep recording.
Continue: give up the previous part and keep recording.
stop: stop recording and give up the transcription.
Finish: stop recording and transcribe the whole audio record.
Installation
It is build on the base of OpenAI whisper, so you need to follow the instruction of installing whisper.
whisper document: https://github.com/openai/whisper
Main steps of Installation:
1. Download and install software
Download and install the following software (go to next step if installed already):
Python 3.10: https://www.python.org/downloads/release/python-31010/
FFmpeg: https://ffmpeg.org/download.html
2. Install the python packages
Install the python packages automatically by requirements.txt:
pip install -r requirements.txt
main.py
3. Run Run the main.py
file with python.
Tips
1. First time installation
The process of downloading whisper model to your device might take about several minutes, so it needs patience when the first time you start the transcription after the installation.
2. First transcription task after startup
The process of loading whisper model to RAM might take about ten seconds, so it is better to waite until the text box show the words when the first transcription task start after you start up whisper-real-time every time.
3. Customization
The default configuration is not suit for everyone, so make some modifications in language or model type may make the result better.
4. Supported Platforms
Python can be run on varies of operating systems, but whisper-real-time was developed on Windows system, hadn't been verified in other systems such as Mac and Linux, so there may be some bugs on these systems...
Reference
Official Website: https://dovej.com
Document: https://dovej.com/blog/003