deepgram/deepgram-js-sdk

Mismatch client protocol when running listen.live on bun

Closed this issue · 2 comments

What is the current behavior?

When running listen.live on Bun, the client fails to initialize and start the connection. Instead, it fails immediately with a "Mismatch client protocol" error.

What's happening that seems wrong?
Client should start the connection

Steps to reproduce

To make it faster to diagnose the root problem, tell us how we can reproduce the bug.
Install the deps and create two files (Ubuntu machine with PulseAudio).
Create a test.js file:

import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk';
import { spawn } from 'child_process';
import { Readable } from 'stream';

export class AudioStream extends Readable {
  constructor() {
    super();

    this.pacat = spawn('pacat', [
      '--record',
      '--format=s16le', // 16-bit little-endian
      '--rate=16000', // 16kHz sample rate
      '--channels=1', // Mono
      '--latency-msec=50', // Adjust this value as needed
    ]);

    this.pacat.stdout.on('data', (chunk) => {
      this.push(chunk);
    });

    this.pacat.stderr.on('data', (data) => {
      // Log the stderr text itself rather than only its byte length
      console.error(`pacat stderr: ${data.toString()}`);
    });

    this.pacat.on('close', (code) => {
      console.log(`pacat process exited with code ${code}`);
      this.push(null);
    });

    this.pacat.on('error', (error) => {
      console.error(`pacat process error: ${error.message}`);
    });
  }

  _read() {}

  stopRecording() {
    if (this.pacat) {
      this.pacat.kill();
    }
  }
}

const startConnection = async () => {
  const deepgram = createClient('<api-key>');
  const LiveAudioStream = new AudioStream();
  const connection = deepgram.listen.live({
    model: 'nova-2',
  });

  connection.on(LiveTranscriptionEvents.Open, () => {
    // readyState 1 means OPEN
    if (connection.getReadyState() === 1) {
      console.log('Connection opened');
    } else {
      console.error('Connection failed to open');
    }
    // Forward each raw PCM chunk from pacat while the socket is open
    LiveAudioStream.on('data', (chunk) => {
      if (connection.getReadyState() === 1) {
        connection.send(chunk);
      }
    });
  });

  connection.on(LiveTranscriptionEvents.Close, (event) => {
    console.log('Connection closed', event);
  });

  connection.on(LiveTranscriptionEvents.Transcript, (results) => {
    console.log('Received transcription results', results);
  });

  connection.on(LiveTranscriptionEvents.Metadata, (metadata) => {
    console.log('Received metadata', metadata);
  });

  connection.on(LiveTranscriptionEvents.Error, (error) => {
    console.error('An error occurred', error);
  });

  //   connection.on(LiveTranscriptionEvents, (warning) => {
  //     console.warn("Received a warning", warning);
  //   });

  return connection;
};

void startConnection();

Then run it with bun run test.js. The connection closes immediately with a client protocol mismatch (on the latest SDK and also on 3.0.0).

Expected behavior

What would you expect to happen when following the steps above?

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: Windows 10 with WSL2
  • Language: TypeScript
  • Browser: ---

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

Connection closed CloseEvent {
  isTrusted: true,
  wasClean: true,
  code: 1002,
  reason: "Mismatch client protocol",
  type: "close",
  target: WebSocket {
    URL: "wss://api.deepgram.com/v1/listen?model=nova-2",
    url: "wss://api.deepgram.com/v1/listen?model=nova-2",
    readyState: 3,
    bufferedAmount: 0,
    onopen: [Function],
    onmessage: [Function],
    onerror: [Function],
    onclose: [Function],
    protocol: "",
    extensions: "",
    binaryType: "nodebuffer",
    send: [Function: send],
    close: [Function: close],
    ping: [Function: ping],
    pong: [Function: pong],
    terminate: [Function: terminate],
    CONNECTING: 0,
    OPEN: 1,
    CLOSING: 2,
    CLOSED: 3,
    addEventListener: [Function: addEventListener],
    removeEventListener: [Function: removeEventListener],
    dispatchEvent: [Function: dispatchEvent],
  },
  currentTarget: WebSocket {
    URL: "wss://api.deepgram.com/v1/listen?model=nova-2",
    url: "wss://api.deepgram.com/v1/listen?model=nova-2",
    readyState: 3,
    bufferedAmount: 0,
    onopen: [Function],
    onmessage: [Function],
    onerror: [Function],
    onclose: [Function],
    protocol: "",
    extensions: "",
    binaryType: "nodebuffer",
    send: [Function: send],
    close: [Function: close],
    ping: [Function: ping],
    pong: [Function: pong],
    terminate: [Function: terminate],
    CONNECTING: 0,
    OPEN: 1,
    CLOSING: 2,
    CLOSED: 3,
    addEventListener: [Function: addEventListener],
    removeEventListener: [Function: removeEventListener],
    dispatchEvent: [Function: dispatchEvent],
  },
  eventPhase: 2,
  cancelBubble: false,
  bubbles: false,
  cancelable: false,
  defaultPrevented: false,
  composed: false,
  timeStamp: 0,
  srcElement: WebSocket {
    URL: "wss://api.deepgram.com/v1/listen?model=nova-2",
    url: "wss://api.deepgram.com/v1/listen?model=nova-2",
    readyState: 3,
    bufferedAmount: 0,
    onopen: [Function],
    onmessage: [Function],
    onerror: [Function],
    onclose: [Function],
    protocol: "",
    extensions: "",
    binaryType: "nodebuffer",
    send: [Function: send],
    close: [Function: close],
    ping: [Function: ping],
    pong: [Function: pong],
    terminate: [Function: terminate],
    CONNECTING: 0,
    OPEN: 1,
    CLOSING: 2,
    CLOSED: 3,
    addEventListener: [Function: addEventListener],
    removeEventListener: [Function: removeEventListener],
    dispatchEvent: [Function: dispatchEvent],
  },
  returnValue: true,
  composedPath: [Function: composedPath],
  stopPropagation: [Function: stopPropagation],
  stopImmediatePropagation: [Function: stopImmediatePropagation],
  preventDefault: [Function: preventDefault],
  initEvent: [Function: initEvent],
  NONE: 0,
  CAPTURING_PHASE: 1,
  AT_TARGET: 2,
  BUBBLING_PHASE: 3,
}
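
The dump above is long, but only code, reason, and wasClean carry information. As a rough sketch (the helper name summarizeClose is made up for illustration and is not part of the Deepgram SDK), the close handler could log a one-line summary instead. The code meanings come from RFC 6455, where 1002 is a protocol error:

```javascript
// Meanings of common WebSocket close codes (RFC 6455, section 7.4.1).
const CLOSE_CODES = {
  1000: 'normal closure',
  1001: 'going away',
  1002: 'protocol error',
  1003: 'unsupported data',
  1006: 'abnormal closure (no close frame received)',
  1008: 'policy violation',
  1009: 'message too big',
  1011: 'internal error',
};

// Hypothetical helper: reduce a CloseEvent-like object to a one-line summary.
function summarizeClose(event) {
  const meaning = CLOSE_CODES[event.code] ?? 'unknown code';
  const reason = event.reason ? `, reason: "${event.reason}"` : '';
  return `closed with ${event.code} (${meaning})${reason}, clean: ${Boolean(event.wasClean)}`;
}

// Values copied from the log above.
console.log(
  summarizeClose({ code: 1002, reason: 'Mismatch client protocol', wasClean: true })
);
```

In the repro script this would be used as console.log(summarizeClose(event)) inside the LiveTranscriptionEvents.Close handler.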

I suspect that, because Bun has a built-in WebSocket, the code is using that class with a subprotocol and Bun doesn't like that. I also wonder if this will break when using Node's new experimental native WebSocket support. Hmmmmmmm
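
One way to check that guess might be to log which runtime and which WebSocket implementation are present before opening the connection. This is only a diagnostic sketch: describeRuntime is a hypothetical helper, and it does not show which class the SDK actually picks internally.

```javascript
// Hypothetical diagnostic: report the runtime and whether a global WebSocket exists.
// Bun also sets process.versions.node, so the bun key is checked first.
function describeRuntime(env = globalThis) {
  const versions = env.process?.versions ?? {};
  let runtime = 'unknown';
  if (versions.bun) {
    runtime = `bun ${versions.bun}`;
  } else if (versions.node) {
    runtime = `node ${versions.node}`;
  }
  return {
    runtime,
    hasGlobalWebSocket: typeof env.WebSocket === 'function',
  };
}

console.log(describeRuntime());
```

If hasGlobalWebSocket is true under Bun, that would at least be consistent with the built-in class being picked up, though it does not confirm it.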

I tend to agree. I'm not sure what protocol the Deepgram servers expect, but it would be best if we could polyfill it.
It's a blocker for me right now. I'm trying to move my stack from AssemblyAI to Deepgram, but I've been banging my head against this for the past few hours with no success.