Buffer bytes silently reused in internal queue in SendData results in repeated text in live transcriptions

Question

Closed this issue 6 months ago · 7 comments

What is the current behavior?

If I use something like NAudio to get audio data, NAudio internally reuses a buffer. When I call SendData, a reference to that buffer is stored rather than a copy of its contents. So when there is any latency or a disconnection, the EnqueueForSending method enqueues the same byte array instance again, and the same audio ends up being sent multiple times.

This produces transcripts like "Hey, I think think think think think think that...". Without looking at the source, I wouldn't have known that the buffer itself is being queued.

Steps to reproduce

You can probably create artificial latency or disconnect the websockets to cause the problem. For example:

var buffer = new byte[3200];
// fill some values...
deepgramLive.SendData(buffer);
// fill some different values...
deepgramLive.SendData(buffer);
// now let the client dequeue; you'll see the same instance twice
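
The same effect can be shown without the SDK. The sketch below is only an illustration of the aliasing, not the library's code: `AliasingDemo` and its plain `Queue<byte[]>` are hypothetical stand-ins for a client that stores the caller's array instead of a copy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a client that queues the caller's array by reference.
public static class AliasingDemo
{
    public static List<byte> Run()
    {
        var queue = new Queue<byte[]>();
        var buffer = new byte[4];

        Array.Fill(buffer, (byte)1);   // first chunk of "audio"
        queue.Enqueue(buffer);         // stores a reference, not a copy

        Array.Fill(buffer, (byte)2);   // the capture library overwrites its buffer
        queue.Enqueue(buffer);         // same instance enqueued again

        // Simulate a delayed send: drain only after both chunks were captured.
        var firstByteOfEachEntry = new List<byte>();
        while (queue.Count > 0)
            firstByteOfEachEntry.Add(queue.Dequeue()[0]);

        return firstByteOfEachEntry;   // both entries now hold the second chunk
    }
}
```

Both dequeued entries contain the second chunk, which is why the transcription repeats the latest words.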

Expected behavior

This is not obvious.

Document that the buffer cannot be reused after it is passed to SendData, so people aren't surprised by random repetition.
Because callers cannot control the library's internal queue, the library could copy the bytes into its own buffers and keep a list of them (a buffer pool). This is impossible to do from "outside" the lib.

Please tell us about your environment

Operating System/Version: Windows 11
Language: C#

Other information

The only solution to this problem right now is to systematically copy the bytes to a new array, which puts unnecessary pressure on the GC. I could also create a large buffer pool, but then instead of continuously creating small buffers, I'd reserve a large amount of memory with no guarantee that it won't overflow anyway.
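
For reference, the copy workaround can be sketched as below. `SafeSend` and `SendCopy` are hypothetical names, and the `send` delegate is a stand-in for a call such as `deepgramLive.SendData`; the `count` parameter is an assumption for capture libraries that report how many bytes of the buffer are valid.

```csharp
using System;

public static class SafeSend
{
    // Copies the valid bytes into a fresh array before handing them to `send`,
    // so later writes to `source` cannot change what was queued.
    // Cost: one allocation per chunk, which is the GC pressure mentioned above.
    public static void SendCopy(Action<byte[]> send, byte[] source, int count)
    {
        var snapshot = new byte[count];
        Buffer.BlockCopy(source, 0, snapshot, 0, count);
        send(snapshot);
    }
}
```

Usage would look like `SafeSend.SendCopy(deepgramLive.SendData, buffer, buffer.Length);`, assuming SendData accepts a `byte[]` as in the example above.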

Answer 1 · 2023-09-11T21:54:38.000Z

Great callout, thanks!

Answer 2 · 2023-11-01T17:36:53.000Z

@acidbubbles the team discussed this issue a bit this week, and we were curious about an option for the next major version of the SDK:
What if we eliminated the byte queue altogether?

Answer 3 · 2023-11-02T02:46:12.000Z

I can't say with certainty what the right approach is, but here are my thoughts.

People who implement the API may want to decide whether to slow down their audio streaming, rely on your queuing implementation, or provide their own.

Right now, because the queuing is "broken" (at least for most cases of live audio libraries that I've seen that use a buffer), removing it altogether is a sound option.

However, if you remove the queuing, writing the bytes would be blocking, right? To keep library consumers from naively implementing it in a way that slows down their app, you could provide local buffering as a wrapper / utility class instead. Something like new BufferedClientWriter(actualClient). If you do that (I'd use it!), just be careful to use immutable byte array representations.
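
A minimal sketch of what such a wrapper could look like, purely as an illustration of the suggestion and not an SDK API. It assumes the underlying send is a blocking call, takes it as a hypothetical `Action<byte[]>` delegate rather than the client itself, and copies each chunk so the queued data never changes after `Write` returns.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical utility class; not part of the SDK.
public sealed class BufferedClientWriter : IDisposable
{
    private readonly BlockingCollection<byte[]> _queue = new BlockingCollection<byte[]>();
    private readonly Task _pump;

    // `blockingSend` stands in for the client's direct, blocking send call.
    public BufferedClientWriter(Action<byte[]> blockingSend)
    {
        // A single background task drains the queue in order.
        _pump = Task.Run(() =>
        {
            foreach (var chunk in _queue.GetConsumingEnumerable())
                blockingSend(chunk);
        });
    }

    // Returns immediately; the caller may reuse `source` right away
    // because only the private snapshot is queued.
    public void Write(byte[] source, int count)
    {
        var snapshot = new byte[count];
        Buffer.BlockCopy(source, 0, snapshot, 0, count);
        _queue.Add(snapshot);
    }

    // Stops accepting data and waits until everything queued has been sent.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _pump.Wait();
        _queue.Dispose();
    }
}
```

The queue here is unbounded, so a long disconnection still grows memory; a bounded capacity or a drop policy would be a separate design decision.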

Hope my humble opinion is helpful :)

Answer 4 · 2024-04-02T18:25:12.000Z

Will check to see if this still applies for v4.

Answer 5 · 2024-04-03T19:42:13.000Z

This should no longer be an issue in v4.

Answer 6 · 2024-04-04T12:54:03.000Z

What will be the strategy? There's no buffering/queuing at all, or did you implement a buffer copy?

Answer 7 · 2024-04-04T13:29:25.000Z

You will have access to an internal queue and also direct access to the send function using a buffer. These should show up in the next beta or the first RC.