davidfoerster/KaleidOK-examples

Automatic, threshold-based speech interval detection


Automatic speech detection is needed, or an alternative if it conflicts with something else. Triggering the recording's In and Out manually is too difficult (or just In, now that the 8-second cut-off is implemented).
Pressing the record In key once to start a session would be sufficient. Could it then loop-record every 8 seconds until a final Out key press when the user is finished?

Automatic restarts of the recording are difficult because we would need to make sure that we don't cut off the speaker mid-word. This would require algorithms similar to those needed for detecting recording starts and ends automatically.
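
To illustrate what such an algorithm might look like, here is a minimal sketch of threshold-based interval detection with a "hangover" period, so that short pauses between words do not end an interval. It is not part of KaleidOK or the STT library; the class `SpeechIntervalDetector`, its methods, and all parameters (`frameSize`, `threshold`, `hangoverFrames`) are hypothetical names chosen for this example, and it assumes the audio is already available as normalized samples in [-1, 1].

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not an existing KaleidOK or STT library class. */
final class SpeechIntervalDetector {

    /** Root mean square (volume level) of one frame of samples. */
    static double rms(double[] samples, int offset, int length) {
        double sum = 0;
        for (int i = offset; i < offset + length; i++) {
            sum += samples[i] * samples[i];
        }
        return Math.sqrt(sum / length);
    }

    /**
     * Returns detected intervals as {startFrame, endFrame} pairs
     * (start inclusive, end exclusive). An interval starts at the first
     * frame whose RMS reaches the threshold and ends only after more than
     * hangoverFrames consecutive quiet frames.
     */
    static List<int[]> detect(double[] samples, int frameSize,
                              double threshold, int hangoverFrames) {
        List<int[]> intervals = new ArrayList<>();
        int frameCount = samples.length / frameSize; // trailing partial frame is ignored
        int start = -1; // first frame of the current interval, -1 while idle
        int quiet = 0;  // consecutive quiet frames since the last loud frame

        for (int f = 0; f < frameCount; f++) {
            boolean loud = rms(samples, f * frameSize, frameSize) >= threshold;
            if (loud) {
                if (start < 0) {
                    start = f;
                }
                quiet = 0;
            } else if (start >= 0 && ++quiet > hangoverFrames) {
                // The pause is long enough: close the interval after the last loud frame.
                intervals.add(new int[] {start, f - quiet + 1});
                start = -1;
                quiet = 0;
            }
        }
        if (start >= 0) {
            // Input ended while an interval was still open.
            intervals.add(new int[] {start, frameCount - quiet});
        }
        return intervals;
    }
}
```

With a hangover of 0 frames every short pause splits the recording, while a larger value merges neighbouring words into one interval. Picking the threshold and the hangover length for real microphone input is the hard part and would need tuning against background noise.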

The maximum recording interval length can be configured with com.getflourish.stt2.stt.interval as described in the documentation.

Via Jabber with @Disastergirl today (links included by the editor):

(13:45:55) @davidfoerster: the most promising solution to that seems to be Sphinx, since it imposes no limit on the length of speech recordings and includes volume level detection for automatic activation.

(13:46:30) @davidfoerster: the biggest issue, according to our preliminary findings, is its performance (accuracy vs. speed).

(13:48:15) @davidfoerster: from the point of view of software engineering the integration of sphinx into kaleidok is simple enough, maybe a day's work.

See also #14.