drachtio/drachtio-freeswitch-modules

Audio stream delay in load testing


Hi Dave,
Thank you so much for your fantastic FreeSwitch modules!

I am seeing the same audio delay issue in my system.

The system

| FreeSwitch (equipped with AudioFork) | => | Callbot IVR server (powered by eslgo) | => | ASR server |

When a user calls FS (FreeSwitch), which runs as a docker container with network_mode="host",
FS makes an outbound ESL connection to the IVR server. The IVR server sends a uuid_audio_fork command so that FS streams the call audio to it, and then forwards that audio stream to the ASR server to get the text of the user's speech. Please note that the IVR server acts as an audio forwarder only; I do not implement any audio processing in this service.

Configuration

  • The MOD_AUDIO_FORK_SERVICE_THREADS is set to 5
  • Hardware:
    • VM1 (8 CPU, 16 GB): hosting FS and IVR service
    • VM2 (8 CPU, 64 GB, GTX 1080 GPU): hosting ASR service

Test scenario

a/ FS test:

I use a custom SIP application to generate 5 concurrent calls to the FS server. After the application receives the ANSWER response from FS,
it starts sending audio bytes to the RTP socket that FS opened.

Here are the metrics that I record for each call (only one audio file is sent to FS per call):

  • The start time when the sip application starts sending the audio (t1)
  • The time when the IVR server gets the ASR's final text result for the above audio (t2). My ASR server can handle 8 streams concurrently.

This gives me a table with 3 columns (call_id, t1, t2) and 5 rows.
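
For reference, here is a minimal sketch of how the per-call latency can be derived from such a table. It assumes t1 and t2 are epoch timestamps in seconds taken from synchronized clocks; the `CallRecord` type, the `latencies` helper, and the sample values are hypothetical placeholders, not my real measurements.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    call_id: str
    t1: float  # assumed: epoch seconds when the SIP app starts sending audio
    t2: float  # assumed: epoch seconds when the IVR gets the final ASR text

    @property
    def latency(self) -> float:
        # End-to-end time from first audio byte sent to final ASR result.
        return self.t2 - self.t1


def latencies(records):
    """Map each call_id to its end-to-end latency in seconds."""
    return {r.call_id: r.latency for r in records}


# Placeholder rows for illustration only (not measured data).
records = [
    CallRecord("call-1", 100.0, 104.5),
    CallRecord("call-2", 100.0, 106.0),
]
result = latencies(records)
print(result)
```

Note that this latency includes the duration of the audio itself, so it is only meaningful when compared against the same audio in another scenario.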

b/ ASR isolation test:

I use a custom tool to send the same audio files as 5 concurrent streams directly to the ASR server and record the same metrics:

  • The start time when the tool starts sending the audio
  • The time when the tool gets the ASR's final text result

Then I compared this table with the one from scenario (a) and found that 30% of the calls in (a) took much longer to get the final ASR result than the same audio did in scenario (b).
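
The comparison between the two scenarios can be sketched roughly as follows. This assumes both tables are keyed by the same audio identifier and that latencies are in seconds; the `delayed_calls` helper, the 1-second threshold, and the sample numbers are hypothetical and only illustrate the method.

```python
def delayed_calls(fs_latency, asr_latency, threshold_s=1.0):
    """Return the extra delay of the FS path over the ASR-only path,
    keeping only entries where it exceeds threshold_s seconds."""
    extra = {
        key: fs_latency[key] - asr_latency[key]
        for key in fs_latency
        if key in asr_latency  # compare only audio present in both runs
    }
    return {key: delay for key, delay in extra.items() if delay > threshold_s}


# Placeholder latencies in seconds (illustrative, not measured data).
fs_latency = {"audio-1": 4.5, "audio-2": 9.0, "audio-3": 5.0}     # scenario (a)
asr_latency = {"audio-1": 4.0, "audio-2": 4.5, "audio-3": 4.75}   # scenario (b)

flagged = delayed_calls(fs_latency, asr_latency)
print(flagged)
```

Any audio that shows up in `flagged` spent noticeably longer in the FS and IVR path than in the ASR-only path, which is the delay I am trying to track down.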

With the above hardware, we cannot handle 5 concurrent calls smoothly, and my boss is not happy with this :(

Could you please give me some pointers on tuning mod_audio_fork for better performance?
Thank you so much!