Socket client deadlocks on Azure machines with 2 CPUs
Closed this issue · 12 comments
What is the current behavior?
Deadlocks on machines with fewer than 8 CPUs.
Steps to reproduce
Run the client on a 2-CPU machine.
Expected behavior
no deadlocks
Please tell us about your environment
Azure Web App with 2 vCPUs: deadlocks.
The same with 8 CPUs: no deadlocks.
Other information
Investigation in progress. The issue is probably in:
```csharp
void StartSenderBackgroundThread() => _ = Task.Factory.StartNew(
    _ => ProcessSendQueue(),
    TaskCreationOptions.LongRunning);

void StartReceiverBackgroundThread() => _ = Task.Factory.StartNew(
    _ => ProcessReceiveQueue(),
    TaskCreationOptions.LongRunning);

void StartKeepAliveBackgroundThread() => _ = Task.Factory.StartNew(
    _ => ProcessKeepAlive(),
    TaskCreationOptions.LongRunning);

void StartAutoFlushBackgroundThread() => _ = Task.Factory.StartNew(
    _ => ProcessAutoFlush(),
    TaskCreationOptions.LongRunning);
```
- and a send inside a `lock` whose `SendAsync` is never awaited:

```csharp
// TODO: Add logging for message capturing for possible playback
Log.Verbose("ProcessSendQueue", "Sending message...");
lock (_mutexSend)
{
    _clientWebSocket.SendAsync(message.Message, message.MessageType, true, _cancellationTokenSource.Token)
        .ConfigureAwait(false);
}
```
Taking a look at this right now.
Yup 💯 Looking at a fix for this.
The problem is in the message enqueue (channel write/read). I replaced EnqueueSendMessage and ProcessSendQueue with a direct `_clientWebSocket.SendAsync` call and everything works correctly.
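For comparison, here is a minimal sketch of a channel-backed send queue that avoids both pitfalls (blocking channel reads and overlapping sends) by awaiting throughout. The names mirror those in the thread but are illustrative, not the SDK's actual implementation:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

class SendQueueSketch
{
    private readonly Channel<byte[]> _sendChannel =
        Channel.CreateUnbounded<byte[]>();

    public bool EnqueueSendMessage(byte[] message) =>
        _sendChannel.Writer.TryWrite(message);

    // Drain the queue with await instead of blocking reads, so no
    // thread is held hostage while waiting for messages, and each
    // socket send completes before the next one starts.
    public async Task ProcessSendQueue(Func<byte[], Task> sendAsync)
    {
        await foreach (byte[] message in _sendChannel.Reader.ReadAllAsync())
        {
            await sendAsync(message); // one send at a time, awaited
        }
    }
}
```

Because the consumer awaits each send, the queue itself serializes access to the socket and no lock is needed around `SendAsync`.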
Yup, found the same problem and working on the fix.
P.S. You can't do this:

```csharp
lock (_mutexSend)
{
    Log.Verbose("SendBinaryImmediately", "Sending binary message immediately..");
    if (length == -1)
    {
        length = data.Length;
    }
    _clientWebSocket.SendAsync(new ArraySegment<byte>(data, 0, length), WebSocketMessageType.Binary, endOfMessage: true, _cancellationTokenSource.Token).ConfigureAwait(continueOnCapturedContext: false);
}
```

We can't start another send on the same web socket while one is in progress, but `SendAsync` is never awaited here: `_mutexSend` is unlocked before `SendAsync` finishes, so this code can be entered again and a second send started on the same socket while the first is still in progress.
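It's worth noting why the original code couldn't simply await inside the lock: C# rejects `await` in the body of a `lock` statement, which is what pushes the fix toward an awaitable primitive such as `SemaphoreSlim`. A minimal sketch:

```csharp
using System.Threading.Tasks;

class AwaitInLockSketch
{
    static readonly object _mutexSend = new object();

    static async Task SendSerializedAsync(Task sendTask)
    {
        lock (_mutexSend)
        {
            // await sendTask;  // does not compile: error CS1996
            //                  // "Cannot await in the body of a lock statement"
        }
        await sendTask; // compiles outside the lock, but then the lock
                        // no longer protects the send itself
    }
}
```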
I refactored this method in my project:
```csharp
public async Task SendBinaryImmediately(byte[] data, int length = Constants.UseArrayLengthForSend)
{
    if (!await _mutexSend.WaitAsync(SEND_MUTEXT_TIMEOUT, _cancellationTokenSource.Token))
    {
        Log.Error("SendBinaryImmediately", "Mutex timeout");
        return;
    }
    try
    {
        Log.Verbose("SendBinaryImmediately", "Sending binary message immediately.."); // TODO: dump this message
        if (length == Constants.UseArrayLengthForSend)
        {
            length = data.Length;
        }
        await _clientWebSocket.SendAsync(new ArraySegment<byte>(data, 0, length), WebSocketMessageType.Binary, true, _cancellationTokenSource.Token);
    }
    finally
    {
        _mutexSend.Release();
    }
}
```
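For completeness, the refactored method relies on a `SemaphoreSlim` field and a timeout constant that aren't shown in the snippet. They would look something like this (field names taken from the snippet; the containing class name and the timeout value are assumptions):

```csharp
using System;
using System.Threading;

public partial class WebSocketClient // hypothetical containing class
{
    // A SemaphoreSlim(1, 1) acts as an awaitable mutex, unlike lock:
    // at most one caller holds it, and WaitAsync can be awaited.
    private readonly SemaphoreSlim _mutexSend = new SemaphoreSlim(1, 1);

    // The snippet only references this name; the value is an assumption.
    private static readonly TimeSpan SEND_MUTEXT_TIMEOUT = TimeSpan.FromSeconds(10);
}
```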
Yep, that was my conclusion as well. There is a PR open, @vizakgh, if you want to take a look.
Hi, thanks. I can't find a fix in the PR for this: #344 (comment)
You can reproduce this issue by passing big data chunks to SendAsync over a slow network.
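A hypothetical repro along those lines (the endpoint is a placeholder — point it at any WebSocket server you control; on modern .NET the second, overlapping send should fault, since `ClientWebSocket` allows only one outstanding send at a time):

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

class OverlapRepro
{
    static async Task Main()
    {
        using var ws = new ClientWebSocket();
        // "wss://example.invalid/ws" is a placeholder, not a real server.
        await ws.ConnectAsync(new Uri("wss://example.invalid/ws"), CancellationToken.None);

        // A large chunk keeps the first send in flight long enough on a
        // slow network for the second, un-awaited send to overlap it.
        var big = new byte[4 * 1024 * 1024];

        _ = ws.SendAsync(new ArraySegment<byte>(big), WebSocketMessageType.Binary, true, CancellationToken.None);
        // Second send while the first is still outstanding.
        await ws.SendAsync(new ArraySegment<byte>(big), WebSocketMessageType.Binary, true, CancellationToken.None);
    }
}
```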
It should be in the link in the issue here:
#345
The change is significant because of the backward-compatibility guarantees that we need to keep.
You need to either:
- use the factory method, which defaults to using the latest interface version: https://github.com/deepgram/deepgram-dotnet-sdk/pull/345/files#diff-4d01ea1628d8d4a558a7f2121defde3049919f04b51c96a2f3ad6118d9777375R61
- OR directly instantiate v2 of the Listen WS Client (this also affects the TTS WS client):
  - Client `Deepgram.Clients.Listen.v2.WebSocket`: https://github.com/deepgram/deepgram-dotnet-sdk/pull/345/files#diff-c270781c56253048bf4dc89b737a52466dd35f015eda401aa8b541bfdb67bc48R5
  - Model `Deepgram.Models.Listen.v2.WebSocket`
Hi @jcdyer Have you been able to verify this PR addresses your issue?
Closing as this has been merged. There is still time to check this out, as I have another issue to look at first, so feedback is always welcome.