dotnet/WatsonTcp

Deadlock when sending data from server in MessageReceived handler

sancheolz opened this issue · 7 comments

I run this code on net 6 on ubuntu 22.04

using WatsonTcp;

namespace WatsonDeadlock; // Note: actual namespace depends on the project name.
internal class Program
{
    private static byte[] _serverData;
    private static WatsonTcpServer _tcpServer;
    private static bool _all;
    private static int _countReceivedMessages;
    private static Guid _clientGuid;

    static void Main(string[] args)
    {
        var clientData = File.ReadAllBytes("/home/sancheolz/client2834.txt");
        _serverData = File.ReadAllBytes("/home/sancheolz/server3304.txt");
        _tcpServer = new WatsonTcpServer(null, 9956);
        _tcpServer.Events.ClientConnected += OnClientConnected;
        _tcpServer.Events.MessageReceived += OnTcpMessageReceived;
        _tcpServer.Start();

        var _tcpClient = new WatsonTcpClient("localhost", 9956);
        _tcpClient.Events.MessageReceived += OnMessage;
        _tcpClient.Connect();

        Task.Run(() =>
        {
            int i = 0;
            while (i < 1000)
            {
                _tcpClient.Send(clientData);
                i++;
            }
        });

        while (true)
        {
            if (_all)
                break;
            Thread.Sleep(100);
        }

        Console.WriteLine($"_countReceivedMessages {_countReceivedMessages}");
    }

    private static void OnMessage(object sender, MessageReceivedEventArgs e)
    {
        if (Interlocked.Add(ref _countReceivedMessages, 1) > 999) _all = true;
    }

    private static void OnTcpMessageReceived(object sender, MessageReceivedEventArgs e)
    {
        _tcpServer.Send(_clientGuid, _serverData);
    }

    private static void OnClientConnected(object sender, ConnectionEventArgs e)
    {
        _clientGuid = e.Client.Guid;
    }
}

server3304.txt
client2834.txt

Sometimes this code works to the end, and sometimes it freezes. If you stop the debugger at the moment of hanging, then the _countReceivedMessages variable is equal to the value 854, and if you continue execution, the value does not change.
Perhaps this issue affects this theRainbird/CoreRemoting#44
version of WatsonTCP is 5.1.0

In test attached at theRainbird/CoreRemoting#44 most of the time is spent on
read = await stream.ReadAsync(buffer, 0, buffer.Length);
in method
msgData = await WatsonCommon.ReadFromStreamAsync(msg.DataStream, msg.ContentLength, bufferLen);
On my machine(net6 on ubuntu 22.04), the whole test takes about 120 seconds. And this method takes about 80 seconds.
It looks like a problem with the slow operation of the ReadAsync method of the NetworkStream class on Linux.
On my machine(net6 on windows 10 x64), the whole test takes about 50 seconds and ReadAsync method takes 0 seconds.
The calculation of the time spent was done as follows

                        var ts = Stopwatch.GetTimestamp();

                        read = await stream.ReadAsync(buffer, 0, buffer.Length);

                        var time = Stopwatch.GetTimestamp() - ts;
                        Interlocked.Add(ref _totalReadtime, time * 1000/Stopwatch.Frequency);
                        Console.WriteLine("TotalReadMessageDataAsync " + _totalReadtime);

Deadlock is gone when I set _tcpServer.Settings.NoDelay = true; and _tcpClient.Settings.NoDelay = true;

Hi @sancheolz I'm sorry I haven't had a chance to respond to this. Thank you for letting me know. I need to go back and re-evaluate whether or not Settings.NoDelay should be true by default. Since WatsonTcp is framing, I can't think of a good reason why NoDelay shouldn't be true by default. Out of curiosity, how long does the test take after the change? The issue you mentioned said it was taking 800 seconds. How is it now?

ubuntu websocket net6 run				Elapsed ms   4 217
ubuntu watsontcp default net6 run			Elapsed ms 887 276
ubuntu watsontcp nodelay = true net6 run		Elapsed ms  12 608

windows websocket net6 run				Elapsed ms   3 715
windows watsontcp default net6 run			Elapsed ms   6 900
windows watsontcp nodelay = true net6 run		Elapsed ms   7 309

Hi @jchristn If you interesting, I've posted issue at dotnet about this dotnet/runtime#90664

Thank you @sancheolz i will update the library tomorrow to change the default behavior!

Here it is. Thanks again @sancheolz
NuGet: https://www.nuget.org/packages/WatsonTcp/5.1.7
Commit: d433347