Lonami/grammers

Infinite reconnect after a while

ShaykinAnton opened this issue · 6 comments

Hi! Thanks for your library. I'm new to Rust and I've run into a problem with it. I launch 1 bot and 1 client simultaneously, and after a while I get messages like:
received an update referring to an unknown peer, but cannot find out who
and
received an update referencing an unknown peer, treating as gap
Then, after a few hours, the client stops processing messages and starts spamming the log with:

2024-01-05 12:09:10,674 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: retrying the call
2024-01-05 12:09:10,724 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: connecting...
2024-01-05 12:09:10,769 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: retrying the call
2024-01-05 12:09:10,815 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: connecting...
2024-01-05 12:09:10,860 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: retrying the call
2024-01-05 12:09:10,905 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: connecting...
2024-01-05 12:09:10,951 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: retrying the call
2024-01-05 12:09:10,998 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: connecting...
2024-01-05 12:09:11,042 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: retrying the call
2024-01-05 12:09:11,086 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: connecting...
2024-01-05 12:09:11,133 /Users/anton/.cargo/git/checkouts/grammers-689e30b82f69dcd5/04d9577/lib/grammers-mtsender/src/lib.rs INFO: retrying the call

and so on until the application is restarted. This even ignores ReconnectionPolicy.

I don't think I'll be able to look into this any time soon. You could clone the project and try to remove that logic, see if it helps.

I faced the same issue. I forked the lib and removed the reconnect logic, so the bot crashes instead and is automatically restarted by systemd. Curiously, these disconnects happen around minute 20 of the hour, for some reason. Here are the stats from Jan 08:

00:20:36 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
06:20:56 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
07:20:17 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
11:20:37 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
19:20:35 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
20:20:55 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
23:20:39 systemd[1]: tg-listener.service: Failed with result 'exit-code'.
08:21:01 systemd[1]: tg-listener.service: Failed with result 'exit-code'.

So it is not periodic, but it is always around minute 20 🤯
Also, it looks like there is always a log line like this before the disconnect: grammers_mtsender: got rpc result MsgId(7321794133237723592) but no such request is saved

Any progress on this? I think I have the same issue. The client seems to disconnect and then spams the logs with this:
[2024-05-11T22:46:04Z DEBUG grammers_mtsender] serialized request f3427b8c (ping_delay_disconnect) with MsgId(7367877086501107668)
[2024-05-11T22:46:04Z DEBUG grammers_mtsender] sent request with MsgId(7367877086495227884)

Yeah, I think this is the same as #237. Sorry, I missed that issue before replying to the other one. But there is definitely something going on here.

9c8d738 should fix a pretty serious bug (after an error, when the state is reset, the read buffer was always empty, so any read attempts into the empty buffer would always succeed, leading to very quick looping). Assuming this was the problem, I'll close this issue.

And if not, we have the previously-linked issue to discuss this on.
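
For anyone curious why an empty read buffer leads to such fast looping, here is a minimal sketch using only the Rust standard library. It is not the grammers code and makes no claims about how 9c8d738 is implemented; `count_zero_byte_reads` and `read_some` are hypothetical helpers written for this illustration. It relies on the documented behaviour of `std::io::Read::read`, which returns `Ok(0)` when the destination buffer has length zero, without waiting for or consuming any data.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: counts how many consecutive reads report zero bytes,
/// giving up after `limit` attempts so the demonstration always terminates.
fn count_zero_byte_reads<R: Read>(
    source: &mut R,
    buf: &mut [u8],
    limit: usize,
) -> io::Result<usize> {
    let mut spins = 0;
    while spins < limit {
        // With a zero-length `buf` this returns Ok(0) immediately,
        // so nothing ever slows the loop down.
        if source.read(buf)? != 0 {
            break;
        }
        spins += 1;
    }
    Ok(spins)
}

/// Hypothetical guard: refuse to read into a buffer with no capacity,
/// so the caller gets an error instead of a silent zero-byte "success".
fn read_some<R: Read>(source: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    if buf.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "read buffer has zero capacity",
        ));
    }
    source.read(buf)
}
```

Calling `count_zero_byte_reads` on a `Cursor` that still holds data, with a zero-length array as the buffer, uses up every allowed attempt without consuming a single byte. The same call with a 4-byte buffer makes progress on the first read. In a retry loop with no attempt limit, the first case would look much like the "connecting..." / "retrying the call" pairs logged roughly every 50 ms above.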