deepgram/deepgram-dotnet-sdk

Keep Alive functionality does not work unless a message has been received on the websocket

Closed · 1 comment

What is the current behavior?

When opening a websocket and "holding" it open by sending Keep Alive messages, the websocket closes after 30 seconds. Note that no audio data is sent.

If only a tiny amount of audio data is sent, Deepgram does not respond with a message over the websocket. In this scenario, the websocket also closes after 30 seconds.

If enough audio data is sent that Deepgram responds with a message, the websocket stays open. This is the correct behavior, and sending Keep Alive messages alone should produce the same result.

Steps to reproduce

The C# code below reproduces the issue:

using System;
using System.Threading.Tasks;
using Deepgram;
using Deepgram.Models;
namespace SampleApp;

class SampleApp
{
    public static async Task<int> Main()
    {
        // Set up the DG live client
        var credentials = new Credentials("<api_key>", "https://api.deepgram.com", null);
        var deepgramClient = new DeepgramClient(credentials);
        var deepgramLive = deepgramClient.CreateLiveTranscriptionClient();
        deepgramLive.ConnectionOpened += (object? sender, Deepgram.CustomEventArgs.ConnectionOpenEventArgs e) => { Console.WriteLine("Deepgram Connection Opened"); };
        deepgramLive.TranscriptReceived += (object? sender, Deepgram.Models.TranscriptReceivedEventArgs e) => { Console.WriteLine("Deepgram Message Received"); };
        deepgramLive.ConnectionClosed += (object? sender, Deepgram.CustomEventArgs.ConnectionClosedEventArgs e) => { Console.WriteLine("Deepgram Connection Closed"); };
        deepgramLive.ConnectionError += (object? sender, Deepgram.CustomEventArgs.ConnectionErrorEventArgs e) => { Console.WriteLine("Deepgram Connection Error"); };

        // Open the websocket
        var options = new Deepgram.Models.LiveTranscriptionOptions()
        {
            Tier = "base",
            Model = "general",
            Language = "en",
        };
        await deepgramLive.StartConnectionAsync(options);

        // Send keep-alive messages (without sending any audio) to keep the websocket open
        for (int i = 0; i < 60; i++)
        {
            // Wait 2 seconds between keep-alives without blocking the thread
            await Task.Delay(2000);
            if (System.Net.WebSockets.WebSocketState.Open == deepgramLive.State())
            {
                Console.WriteLine("Sending KeepAlive " + i);
                deepgramLive.KeepAlive();
            }
            else
            {
                // The websocket is no longer open... why???
                Console.WriteLine("Websocket changed state unexpectedly, it's " + deepgramLive.State() + " (iteration " + i + ")");
                break;
            }
        }

        return 0;
    }
}

To run this code (or the SDK repo) in GitHub Codespaces, replace the contents of devcontainer.json with the following:

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/dotnet
{
	"name": "C# (.NET)",
	"image": "mcr.microsoft.com/devcontainers/dotnet:0-7.0",
	"features": {
		"ghcr.io/devcontainers/features/dotnet:1": {
			"installUsingApt": true,
			"version": "7"
		}
	}
}

You can then modify the SDK to debug directly.

Expected behavior

The websocket should not close after 30 seconds because Keep Alive messages are being sent.

Please tell us about your environment

I used GitHub Codespaces with the devcontainer.json shown above.

Other information

I implemented the same functionality in Python and the issue does not occur. This implies the issue is with the .NET code.

I dug into the SDK code carefully and found no indication that the Keep Alive messages are not being sent to Deepgram. Additionally, if you comment out the line deepgramLive.KeepAlive();, the websocket closes after 12 seconds (which is intentional). This implies the Keep Alive is being sent over the websocket.

My best guess is that C# is closing the websocket if the websocket does not receive a message after 30 seconds - but that would be extremely surprising behavior.

Below is the Python code that is "identical" to the C# code. The websocket stays open during the entire lifetime of the code.

from deepgram import Deepgram
import asyncio
import time
import os


async def main():
    DEEPGRAM_API_KEY = os.environ["DEEPGRAM_API_KEY"]
    deepgram = Deepgram(DEEPGRAM_API_KEY)
    deepgramLive = await deepgram.transcription.live({
        "tier": "base",
        "model": "general",
        "language": "en",
    })

    # Add handlers
    deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f"Connection closed with code {c}."))
    deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, print)

    # Send a keep-alive every 2 seconds (no audio is sent) while the socket is open
    for i in range(60):
        time.sleep(2)
        if deepgramLive._socket.open:
            deepgramLive.keep_alive()
            print(f"Keep alive {i}")
        else:
            print("Deepgram connection closed unexpectedly!")
    time.sleep(1)

    # Indicate that we've finished sending data by sending the customary zero-byte message to the Deepgram streaming endpoint, and wait until we get back the final summary metadata object
    await deepgramLive.finish()

asyncio.run(main())

The issue is with the WebSocketMessageType used for the Keep Alive message. It needs to be set to WebSocketMessageType.Text rather than WebSocketMessageType.Binary. PR incoming.
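
For anyone wondering why the message type matters: below is a minimal sketch (not the SDK's code) of how a short client frame is laid out under RFC 6455. The payload bytes are identical either way; only the opcode in the first byte changes (0x1 for text, 0x2 for binary). Presumably the server treats a binary frame as audio rather than as a control message, which would match the behavior reported above. `encode_client_frame` is a hypothetical helper written for this illustration, the `{"type": "KeepAlive"}` payload is assumed from Deepgram's streaming documentation, and the all-zero mask key is used only so the payload stays readable in the output.

```python
import json
import os
from typing import Optional

OPCODE_TEXT = 0x1    # RFC 6455 opcode for a text frame
OPCODE_BINARY = 0x2  # RFC 6455 opcode for a binary frame


def encode_client_frame(payload: bytes, opcode: int,
                        mask_key: Optional[bytes] = None) -> bytes:
    """Encode one final, masked client frame (hypothetical helper).

    This sketch only handles payloads shorter than 126 bytes, which is
    enough for a keep-alive message.
    """
    if len(payload) >= 126:
        raise ValueError("sketch only handles payloads under 126 bytes")
    if mask_key is None:
        mask_key = os.urandom(4)  # real clients use a random mask per frame
    first_byte = 0x80 | opcode          # FIN bit set, then the opcode
    second_byte = 0x80 | len(payload)   # MASK bit set, then 7-bit length
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return bytes([first_byte, second_byte]) + mask_key + masked


# Assumed keep-alive payload; same bytes for both frames.
keepalive = json.dumps({"type": "KeepAlive"}).encode("utf-8")
zero_mask = b"\x00\x00\x00\x00"  # demo only, keeps the payload readable

text_frame = encode_client_frame(keepalive, OPCODE_TEXT, zero_mask)
binary_frame = encode_client_frame(keepalive, OPCODE_BINARY, zero_mask)

print(hex(text_frame[0]), hex(binary_frame[0]))
print(text_frame[1:] == binary_frame[1:])
```

In .NET, the equivalent choice is the messageType argument passed to ClientWebSocket.SendAsync, which is where WebSocketMessageType.Text versus WebSocketMessageType.Binary is selected.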