zaf/asterisk-speech-recog

Problem getting silence detection to work

lgaetz opened this issue · 11 comments

My dream is to plug vintage rotary dial phones into a standard ATA/TDM, pick up the receiver, speak the phone number and have PIAF connect me. Here is the core of the dial plan code I am playing with:

exten => 777,1,agi(googletts.agi,"After the beep, say the phone number you wish to dial",en)
exten => 777,n,agi(speech-recog.agi,en-US)
exten => 777,n,Noop(= Script returned: ${status} , ${id} , ${confidence} , ${utterance} =)
exten => 777,n(proceed),agi(googletts.agi,"I think you said: ${utterance}.  Please hold to dial",en)
exten => 777,n,Wait(3)
exten => 777,n,agi(googletts.agi,"Dialing.",en)
exten => 777,n,Dial(Local/${utterance}@outbound-allroutes)
exten => 777,n,Hangup

The problem is that, using version 0.5 of the speech-recog.agi script, I can't get it to time out with silence detection; it always requires a # to terminate. Am I doing something wrong?

zaf commented

Hello Lorne,
The dialplan is right; the problem comes from the silence detection in
Asterisk. Old phones or bad-quality analog lines create
enough static noise to make Asterisk fail to detect silence,
so the script keeps recording until you manually terminate it by
dialing #. To fix this we would have to patch Asterisk itself and
change the silence threshold in the detection routines. It's not hard
to do this, but it is also not very practical.

zaf commented

In case you are using an analog phone with DAHDI hardware, try adjusting the rxgain value and see if that improves silence detection.

At the moment I am just using a SIP phone for testing. Adjusting the rx/tx gains may fix the silence detection issues, though it would probably degrade actual conversations; I will give it a try at some point. If DTMF and silence detection are both out, is there anything else I can do with a rotary phone and still use this script?

zaf commented

We can have a limit on the recording duration.
This option is already available in the script:
agi(speech-recog.agi,[lang],[timeout])
will record for 'timeout' seconds and then return the detected utterance.

Of course. I kind of feel dumb for asking now. Thanks for the quick replies.

Two more questions:

  1. From your usage notes, it looks to me like the timeout value is meant to control the number of seconds of silence before it times out, not the absolute number of seconds. Can you clarify?
  2. I have tried passing multiple arguments to the script and I don't see any different behavior. If I wanted to do speech recognition using US English, with a 5 second timeout, an interrupt key of 1 and nobeep, would the dialplan line look like this:
    exten => 777,n,agi(speech-recog.agi,en-US,5,1,nobeep)

zaf commented

You are right: the timeout option now controls the silence duration. The recording timeout was the old behavior and is no longer available. Sorry for that, it's my turn to feel dumb :P
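
As a rough illustration of the current behavior, assuming the argument order agi(speech-recog.agi,[lang],[timeout]) quoted earlier in this thread, a call like the following should stop recording after about 5 seconds of silence rather than after 5 seconds of total recording. This only works if Asterisk actually detects the silence:

```ini
; Sketch only: the third argument is the silence duration in seconds,
; not an absolute limit on the recording length
exten => 777,n,agi(speech-recog.agi,en-US,5)
```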

In your second question, do you mean that dialing 1 doesn't terminate the recording? (The syntax is correct as far as I can tell, except for the 'nobeep' option, which must be capitalized.)

Okay, I have done some playing and it is mostly user error:

  1. I had some stray spaces near the commas; Asterisk can't handle spaces between arguments.
  2. I played more with silence detection, and even with my SIP phone on mute I can't trigger a silence timeout.
  3. NOBEEP does work when it is all caps.
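
Putting points 1 and 3 together, the line from my earlier question would look roughly like this. The argument order (language, timeout, interrupt key, NOBEEP) is simply carried over from that earlier line, so treat it as an assumption rather than the script's documented usage:

```ini
; No spaces around the commas, and NOBEEP written in capitals
exten => 777,n,agi(speech-recog.agi,en-US,5,1,NOBEEP)
```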

Given the issues around Asterisk and silence detection, I suggest that you bring back the absolute timeout originally used, perhaps by using negative values: positive timeout values would select silence detection, negative timeout values would be absolute, and zero would disable the timeout.

zaf commented

Seems like a sane thing to do. I'll check it as soon as I find some free time and push some new code here.

I scrubbed my test server running FreePBX 2.10 and Asterisk 1.8.something-CERT and replaced it with FreePBX 2.10 running on Asterisk 1.8.12.0. Silence detection is working fine now.

zaf commented

It seems like it's an Asterisk bug that should be reported upstream.