New features: Automatically start and stop recording. Beep sound once text is inserted.
ossossosso opened this issue · 2 comments
Hi,
my name is Daniele. I'm an Italian stenographer.
Basically, I transcribe what I hear into text using a steno keyboard.
I'm interested in better understanding OpenAI Whisper and its capabilities.
I don't have good hardware, so I'd like to use the OpenAI Speech to Text API.
Thank you very much for your project, WhisperWriter:
I tried it and it works for me.
Not many applications on GitHub that use Whisper for speech-to-text allow all of the following at the same time:
- Use a microphone as the audio source.
- Use the Whisper API instead of a local model.
- Transcribe directly into any text editor.
So thanks for this opportunity.
It would be interesting to have two new features:
- No need to press any shortcut to start recording again.
  I mean, once I have pressed a shortcut like Ctrl+Shift+Space the first time to start recording, and the recording has stopped automatically and the text has been transcribed, it would be great if I didn't need to press the shortcut again; instead, a new recording would start automatically, waiting for my words.
  I would only need a new shortcut to stop recording definitively.
- Because I'm a blind user, a sort of "beep sound" that informs me when the text has been transcribed would be useful, so that I know I can speak again.
Thanks a lot.
Daniele.
Hi Daniele, thank you for your comments! I'm happy to hear that WhisperWriter has worked well for you :)
I appreciate your feature requests, and I went ahead and added the option for a "beep" sound to play once the transcription has finished writing to the screen. After downloading my latest commit, you can turn the feature on by setting the `noise_on_completion` configuration option to `true` in `src\config.json`. If you would like to change the sound that is made, you can replace "beep.wav" in the `assets` folder, or change the file path on line 102 of `main.py`.
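If you would rather script the change than edit the file by hand, here is a minimal Python sketch. It assumes that noise_on_completion is a top-level key in the JSON config file, which may not match the real layout, and `set_config_option` is a hypothetical helper written for this example, not part of WhisperWriter.

```python
import json
from pathlib import Path


def set_config_option(config_path, key, value):
    """Set one top-level option in a JSON config file and return the updated dict.

    Hypothetical helper: assumes the option lives at the top level of the file.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config[key] = value
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


# Example call (path as given in this thread; adjust it to your checkout):
# set_config_option(Path("src") / "config.json", "noise_on_completion", True)
```

Note that Python's `True` is written to the file as JSON `true`.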
Although there is not currently a pipelining feature like the one you described with the default voice activity detection method, you can change the way the app starts and stops recording to a key toggle. If you change `recording_mode` in the configuration options to `press_to_toggle`, the app will start listening when you press the keyboard shortcut and stop listening when you press it a second time, rather than waiting for you to finish speaking.
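To make the toggle behaviour concrete, here is a small Python sketch of the idea. It is only an illustration under my own assumptions: `ToggleRecorder` and `on_shortcut` are hypothetical names, and this is not how the app's code is actually organised.

```python
class ToggleRecorder:
    """Hypothetical press-to-toggle state holder (not WhisperWriter's actual class)."""

    def __init__(self):
        # The app is idle until the shortcut is pressed for the first time.
        self.recording = False

    def on_shortcut(self):
        """Flip the recording state and report which action this press triggered."""
        self.recording = not self.recording
        return "start" if self.recording else "stop"


recorder = ToggleRecorder()
events = [recorder.on_shortcut() for _ in range(3)]
print(events)  # ['start', 'stop', 'start']
```

Each press of the shortcut alternates between starting and stopping, so the length of a recording is decided by the second key press and not by a pause in speech.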
I hope you find these changes useful! Please let me know if there are any other features that would make the app work better for you :)
Thanks a lot! It works perfectly.