jweigelt/swbf2admin

Rcon packet timeout happening at any time

Safraann opened this issue · 7 comments

Hi guys,
First of all, congratulations and thanks for this awesome admin tool. Since the beginning, though, I've been encountering rcon packet timeouts at random times, after which the web panel information is no longer updated. One day I can start the server and the constant timeouts only begin in the evening; the next day I may hit the problem right after a server restart.
To be clear, the server itself works fine, but these rcon packet timeouts happen randomly and often need a server restart to go away. Maybe increasing the RconClient timeout constant would reduce them, I don't know. Do you guys have this kind of problem, and do you have any solution? Thanks in advance!

Hey,

Packet timeouts are not unknown with the GOG server executable.
I have to admit I have not debugged this as much as I probably should have. The rcon support is a bit hacked in for the new game versions as there is no native support. (The most likely explanation is a concurrency issue with native swbf2 code.)
I agree that packet timeouts of any form are usually not acceptable on a local connection.

This happened in particular when the server was empty. (The server being empty used to cause all kinds of problems with server connectivity and such.) It's probably safe to ignore the timeouts if everything still works as it should.

Maybe you could test whether packets are also dropped with players on and if the entire rcon connection gets stuck.

You can test whether rcon is completely stuck by sending a few /admin commands using the web chat. (/admin /status would be a good choice here)

You could also try to increase the timeout, although 500ms is already a rather liberal choice.
In fact, I'll probably move this setting into an XML config file.

There was also an update a release or two ago that patched an rcon concurrency bug that could crash the server. If you're not already using the latest build (or head of master), make sure to update.

Hi, thanks for the quick response! Executing admin commands from the chat doesn't help unfreeze the repeated packet timeouts.
So the last option is to increase the timeout constant a little, maybe to 700 ms. I'll check that on Tuesday. Thanks!

Hi, every time I increase the packet timeout constant (it's currently set to 750 ms), I see values like 750.50 ms or even 780 ms, and then the timeouts occur. I'm pretty sure no value will work; the error lies somewhere else. Do you have any ideas? Thanks in advance!

I mean, when I log (DateTime.Now - start).TotalMilliseconds, I get values equal to the constant plus a few extra milliseconds.

This is likely an issue with native bf2 code that is getting called by the RconServer DLL.

As there is no native rcon support, this functionality is achieved by injecting a DLL at runtime that calls the server's chat handling function (see https://github.com/jweigelt/swbf2admin/blob/master/RconServer/bf2server.cpp#L197).

My best guess would be that the chat handling function sometimes fails and returns an empty string. This results in (https://github.com/jweigelt/swbf2admin/blob/master/RconServer/RconClient.cpp#L54) sending an empty reply.

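Logged values slightly above the timeout constant are expected. This is related to the nature of the sleep function: a sleep call can return later than requested, so the elapsed time measured afterwards overshoots the constant by a few milliseconds.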
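As a rough illustration of why the logged elapsed time always lands a little above the constant, here is a minimal sketch of a polling wait loop. It is written in Python rather than the project's C#, and `wait_for_reply`, `has_reply` and `poll_ms` are hypothetical names, not taken from swbf2admin. The point is only that the timeout is detected after a sleep returns, so the measured value is the constant plus whatever the last sleep overshot.

```python
import time


def wait_for_reply(has_reply, timeout_ms, poll_ms=10):
    """Poll has_reply() until it returns True or timeout_ms has passed.

    Returns (got_reply, elapsed_ms). All names here are illustrative.
    """
    start = time.monotonic()
    while True:
        if has_reply():
            return True, (time.monotonic() - start) * 1000.0
        elapsed_ms = (time.monotonic() - start) * 1000.0
        if elapsed_ms >= timeout_ms:
            # The timeout is only noticed here, after the previous sleep
            # returned, so elapsed_ms is never below timeout_ms.
            return False, elapsed_ms
        time.sleep(poll_ms / 1000.0)


# A reply that never arrives: the loop gives up at or just past 50 ms.
got_reply, elapsed_ms = wait_for_reply(lambda: False, timeout_ms=50)
print(got_reply, round(elapsed_ms, 2))
```

The exact overshoot depends on the poll interval and the OS scheduler, which is why raising the constant only shifts the logged value and does not make the timeouts go away.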

Hi, so you suggest checking that the string returned by bf2server_command is not empty, right? Do you have any other ideas to stop these timeouts? I'm not really sure that trying other timeout values would be a good choice. Thanks in advance!

I think packets getting "lost" due to empty bf2server_command() responses is the most likely explanation. This is essentially just a guess though. A good way to work around this would be to add code that detects whether bf2server_command() executed successfully and reports any errors back to the client.
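The workaround described above could look roughly like the sketch below. It is Python for illustration only (the real RconServer is C++), and `handle_command`, `run_command` and the "OK"/"ERROR" reply prefixes are assumptions made up for this example, not part of the existing protocol. The idea is that the client always gets a non-empty reply, so a failed command shows up as an explicit error instead of a timeout.

```python
def handle_command(run_command, command):
    """Run a command and always return a non-empty reply string.

    run_command stands in for bf2server_command(); the reply prefixes
    are hypothetical and only illustrate the error-reporting idea.
    """
    try:
        output = run_command(command)
    except Exception as exc:
        return "ERROR command raised: " + str(exc)
    if not output or not output.strip():
        # An empty result is reported instead of being sent as-is.
        return "ERROR empty response for: " + command
    return "OK " + output


print(handle_command(lambda cmd: "3 players online", "/status"))
print(handle_command(lambda cmd: "", "/status"))
```

With something like this in place, the client log would show whether timeouts really coincide with empty command results, which is a cheap way to confirm or rule out the guess.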

The "proper" way to fix this would be to dig further into the server binary. As the timeouts occur somewhat randomly, this would be very tedious and won't necessarily lead to success.

I agree that different timeout values will most likely not fix the problem.

It turns out the bug is apparently not caused by the server dll and should be fixed in 5775a19.

If you're interested in the bug (it's a good one):
It happens because "lastMessage" (which is used to store the latest command response) is cleared after sending a command (see https://github.com/jweigelt/swbf2admin/blob/master/SWBF2Admin/Runtime/Rcon/RconClient.cs#L190).
Sometimes the response is received so fast that lastMessage is updated between sending the command and clearing it. This causes the command response to be cleared before it can be evaluated, effectively dropping the server's response.
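The ordering problem can be reproduced deterministically with a small simulation. This is a Python sketch, not the project's C# code: `FakeConnection` and `Client` are hypothetical stand-ins, and the fake transport delivers the reply before `send()` returns to model the "response is received so fast" case. Clearing before sending is shown as one way to avoid the race; it is not necessarily what commit 5775a19 does.

```python
class FakeConnection:
    """Stand-in transport: the reply arrives before send() even returns."""

    def __init__(self, on_reply):
        self.on_reply = on_reply

    def send(self, command):
        self.on_reply("reply to " + command)


class Client:
    def __init__(self, clear_before_send):
        self.clear_before_send = clear_before_send
        self.last_message = None
        self.connection = FakeConnection(self._store_reply)

    def _store_reply(self, message):
        self.last_message = message

    def send_command(self, command):
        if self.clear_before_send:
            # Safe order: reset first, then send.
            self.last_message = None
            self.connection.send(command)
        else:
            # Buggy order: a reply that already arrived is wiped here.
            self.connection.send(command)
            self.last_message = None
        return self.last_message


buggy = Client(clear_before_send=False).send_command("/status")
fixed = Client(clear_before_send=True).send_command("/status")
print(buggy)  # None: the reply was dropped, which looks like a timeout
print(fixed)  # reply to /status
```

In the real client the reply arrives on another thread, so the buggy order only loses a response when the timing is unlucky, which matches the random nature of the reported timeouts.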