fraschetti/Octoslack

high CPU load caused by very quick reconnection attempts

Closed this issue · 5 comments

I have just updated from 1.6.2 to 1.7.0. The 1.6.2 version is working correctly.

With 1.7.0, octoprint.log is flooded with reconnection attempts, which causes the CPU load to spike and makes printing impossible:

2019-01-26 13:33:48,853 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,893 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,912 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,913 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,914 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,914 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,915 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:48,945 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:49,049 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error
2019-01-26 13:33:49,050 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): math range error

Hi @hakan42

I've found where the issue is happening and have a bit of a fix but it's not completely obvious why that issue would have even happened in the first place. I've published a 1.7.1 which you can pick up (the OctoPrint plugin manager's Advanced section will let you force an update check instead of waiting the usual 24 hours) if you want to see if it fixes your issue. If you do have the time, I'd appreciate it if you could gather some diagnostic infomration before you upgrade.

  1. Enable DEBUG logging
  2. Reproduce the issue
  3. Send Octoprint.log

Instructions to enable DEBUG logging for Octoslack

  • Enable DEBUG logging
    Ensure the following has been added to /home/pi/.octoprint/logging.yaml
loggers:
  octoprint.plugins.Octoslack:
    level: DEBUG
  • Restart OctoPrint
sudo service octoprint restart

I installed 1.7.0 again on my machine and enabled debugging. The most obvious thing I can see is the following:

2019-01-27 15:05:12,082 - octoprint.plugins.Octoslack - DEBUG - Attempting to connect Slack RTM API (iteration=46)
2019-01-27 15:05:12,084 - octoprint.plugins.Octoslack - DEBUG - Sleeping for 300 seconds before attempting connection
2019-01-27 15:05:12,086 - octoprint.plugins.Octoslack - ERROR - Error Slack RTM API connection error (Exception): global name 'wait' is not defined

which runs in a very fast loop.

With 1.7.1, now I can't connect at all, the following is from an updated (and freshly started) octoprint:

2019-01-27 15:21:20,511 - octoprint.plugins.Octoslack - ERROR - Failed to reconnect via Slack RTM API
2019-01-27 15:21:20,512 - octoprint.plugins.Octoslack - DEBUG - Attempting to connect Slack RTM API (iteration=2)
2019-01-27 15:21:20,513 - octoprint.plugins.Octoslack - DEBUG - Sleeping for 20 seconds before attempting connection

I have to correct myself, sometimes the connections are successfull, but every few times, I get the "failed to reconnect". I cannot see any pattern yet.

Hi @hakan42
I've made a few more changes that I believe should help - prevent infinite loops & be resistant to additional failure modes.
These changes are baking on a RasPi and Windows install on OctoPrint before I push a new release.
I'll reply here once I have more information to share.

Hi @hakan42
I've realeased 1.7.2 which contains a fix for the reconnection delay logic and some additional logic to prevent getting stuck in error/exception loops.
Please feel free to reopen this issue if you see the problem again.