oshied/directord

client crashes when attempting reconnect

Closed · 1 comment

Describe the bug
The server was unreachable because iptables rules were blocking network access. The client's heartbeat failed, and the client process then crashed while attempting to reconnect:

Aug 18 17:17:26 overcloud-controller-2 directord[18220]: ERROR Heartbeat failure, can't reach server
Aug 18 17:17:26 overcloud-controller-2 directord[18220]: WARNING Reconnecting in [ 2 ]...
Aug 18 17:17:28 overcloud-controller-2 directord[18220]: DEBUG Running reconnection.
Aug 18 17:17:28 overcloud-controller-2 directord[18220]: Process Process-1:
Aug 18 17:17:28 overcloud-controller-2 directord[18220]: Traceback (most recent call last):
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:   File "/usr/lib64/python3.8/multiprocessing/process.py", line 315, in _bootstrap
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:     self.run()
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:   File "/usr/lib64/python3.8/multiprocessing/process.py", line 108, in run
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:     self._target(*self._args, **self._kwargs)
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:   File "/opt/directord/lib64/python3.8/site-packages/directord/client.py", line 128, in run_>
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:     self.driver.heartbeat_reset()
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:   File "/opt/directord/lib64/python3.8/site-packages/directord/drivers/zmq.py", line 444, in>
Aug 18 17:17:28 overcloud-controller-2 directord[18220]:     self.poller.unregister(self.bind_heatbeat)
Aug 18 17:17:28 overcloud-controller-2 directord[18220]: AttributeError: 'Driver' object has no attribute 'bind_heatbeat'

To Reproduce
Steps to reproduce the behavior:

  1. Run the directord server on a node where iptables restricts network access
  2. Run bootstrap
  3. Clients install but cannot connect to the server
  4. The client heartbeat fails, and the reconnect attempt crashes with the AttributeError shown above

Expected behavior
The client should keep retrying the reconnect instead of crashing.

This should now be resolved with the following commit: 85d9a0e
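
For anyone hitting the same traceback: the failing lookup is for `bind_heatbeat`, which looks like a misspelling of `bind_heartbeat` (an assumption, since the source of zmq.py is not shown here). Below is a minimal, stdlib-only sketch of that failure mode and of a reconnect loop that survives it. `FakePoller`, this `Driver`, and `reconnect` are hypothetical stand-ins, not directord's real classes, and this is not necessarily what commit 85d9a0e changed.

```python
import logging

log = logging.getLogger("reconnect-sketch")


class FakePoller:
    """Hypothetical stand-in for a ZeroMQ poller; tracks sockets in a set."""

    def __init__(self):
        self.registered = set()

    def register(self, sock):
        self.registered.add(sock)

    def unregister(self, sock):
        # Raises KeyError if the socket was never registered.
        self.registered.remove(sock)


class Driver:
    """Hypothetical driver used for illustration; not the directord class."""

    def __init__(self):
        self.poller = FakePoller()
        self.bind_heartbeat = "heartbeat-socket"  # placeholder for a socket
        self.poller.register(self.bind_heartbeat)

    def heartbeat_reset_buggy(self):
        # Misspelled attribute, as in the traceback: raises AttributeError.
        self.poller.unregister(self.bind_heatbeat)

    def heartbeat_reset(self):
        # Correct spelling; getattr guards against the socket never being set.
        sock = getattr(self, "bind_heartbeat", None)
        if sock is not None and sock in self.poller.registered:
            self.poller.unregister(sock)
        self.bind_heartbeat = "heartbeat-socket"
        self.poller.register(self.bind_heartbeat)


def reconnect(reset, attempts=3):
    """Call reset up to `attempts` times; return True on the first success."""
    for attempt in range(1, attempts + 1):
        try:
            reset()
            return True
        except Exception:
            # Log and retry instead of letting the process die.
            log.exception("Reconnect attempt %d failed", attempt)
    return False
```

Calling `reconnect(Driver().heartbeat_reset)` succeeds on the first attempt, while passing `heartbeat_reset_buggy` logs each `AttributeError` and returns `False` instead of killing the process. A real client would likely also sleep between attempts, as the "Reconnecting in [ 2 ]..." log line suggests.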