Real-time feedback for words you flag as "bad"

See /example.gif for a demo.

(Note that the model doesn't distinguish "i'm" from "um" very well, so I flagged "i'm" as well.)

Steps

  1. Clone the repo
  2. Download the speech recognition model, save it in the um_detector directory, and rename the downloaded folder to "model". Here's a setup tutorial and the model's alphacephei API.
  3. Edit the bad words in run.py (or leave them as is)
  4. In a terminal, run pip install vosk sounddevice
  5. Run run.py
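
For orientation, here is a minimal sketch of how a script like run.py could tie vosk and sounddevice together. It is not the actual contents of run.py: the names BAD_WORDS, find_bad_words and listen, the 16 kHz sample rate, and the block size are all assumptions made for illustration. It only checks finalized phrases, so feedback arrives at the end of each phrase rather than word by word.

```python
import json
import queue

# Hypothetical word list for illustration; run.py defines its own.
BAD_WORDS = {"um", "uh", "like", "i'm"}


def find_bad_words(text, bad_words=BAD_WORDS):
    """Return the flagged words found in a recognized phrase, in spoken order."""
    return [word for word in text.lower().split() if word in bad_words]


def listen(model_dir="model", samplerate=16000):
    """Stream microphone audio into Vosk and print flagged words (assumed setup)."""
    # Imported here so find_bad_words can be used without audio dependencies.
    import sounddevice as sd
    from vosk import KaldiRecognizer, Model

    audio = queue.Queue()

    def on_audio(indata, frames, time, status):
        # Called by sounddevice on its own thread for every audio block.
        audio.put(bytes(indata))

    recognizer = KaldiRecognizer(Model(model_dir), samplerate)
    with sd.RawInputStream(samplerate=samplerate, blocksize=8000,
                           dtype="int16", channels=1, callback=on_audio):
        while True:
            data = audio.get()
            # AcceptWaveform returns True once a phrase has been finalized.
            if recognizer.AcceptWaveform(data):
                text = json.loads(recognizer.Result()).get("text", "")
                for word in find_bad_words(text):
                    print(f'Flagged: "{word}"')
```

Calling listen() starts the loop; stop it with Ctrl+C. Keeping the word check in its own function makes it easy to try out on plain strings without a microphone.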

I chose this API because it can process words in real time and works offline. Getting feedback during your presentation or Zoom call requires offline speech recognition. Other packages like SpeechRecognition are too slow and don't categorize "um" as a word.

Potential Next Steps

  1. Make the "i'm" vs. "um" distinction more robust
  2. Add feedback on tone shifts
  3. Provide a post-talk summary of your presentation (most/least frequent words, total time not speaking, etc.)
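
As a starting point for the post-talk summary idea, here is a rough sketch that counts word frequencies in a finished transcript. The function name summarize and its parameters are hypothetical and not part of the repo. It assumes the transcript is a plain lowercase-able string, like the text Vosk returns, and it does not cover silence timing.

```python
from collections import Counter


def summarize(transcript, bad_words, top_n=3):
    """Return basic word statistics for a transcript (illustrative only)."""
    words = transcript.lower().split()
    counts = Counter(words)
    return {
        "total_words": len(words),
        # The top_n most frequent words with their counts.
        "most_common": counts.most_common(top_n),
        # Only the flagged words that were actually spoken.
        "flagged": {w: counts[w] for w in bad_words if counts[w]},
    }
```

For example, summarize("um so um i think um", {"um", "like"}, top_n=1) reports 6 total words, with "um" spoken 3 times.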

Any ideas/modifications/comments are welcome.