Watch dir: timeout error?
bs-sdev opened this issue · 1 comments
Hello everyone,
I currently have 3 etcd containers on an OpenStack tenant, each container located on a different VM. One of these VMs has a public IP (10.X.X.X), so it can be reached from anywhere; the others have IPs in 192.X.X.X (private network).
My aim is to watch a directory (recursively) and, for any operation on the database, get back the modified data (not difficult). To do that, I run this code:
def watch_database(self):
    while True:
        watch_result = self.client.read(self.prefix, wait=True,
                                        recursive=True)
        logging.warn(watch_result)
It works at first, but if I wait too long, I get this:
(client) wrapper() Request to server http://10.50.0.216:2379 failed: timeout('timed out',)
(client) machines() Failed to get list of machines from http://192.168.0.4:4001/v2: ConnectTimeoutError(<urllib3.connectionpool.HTTPConnectionPool object at 0x7f1f081f8e90>, u'Connection to 192.168.0.4 timed out. (connect timeout=60)')
(client) machines() Failed to get list of machines from http://192.168.0.5:4001/v2: ConnectTimeoutError(<urllib3.connectionpool.HTTPConnectionPool object at 0x7f1f08209690>, u'Connection to 192.168.0.5 timed out. (connect timeout=60)')
(client) machines() Failed to get list of machines from http://192.168.0.5:2379/v2: ConnectTimeoutError(<urllib3.connectionpool.HTTPConnectionPool object at 0x7f1f08209890>, u'Connection to 192.168.0.5 timed out. (connect timeout=60)')
(client) machines() Failed to get list of machines from http://192.168.0.4:2379/v2: ConnectTimeoutError(<urllib3.connectionpool.HTTPConnectionPool object at 0x7f1f08209a10>, u'Connection to 192.168.0.4 timed out. (connect timeout=60)')
(client) machines() Failed to get list of machines from http://192.168.0.3:2379/v2: ConnectTimeoutError(<urllib3.connectionpool.HTTPConnectionPool object at 0x7f1f08209b90>, u'Connection to 192.168.0.3 timed out. (connect timeout=60)')
(client) machines() Failed to get list of machines from http://192.168.0.3:4001/v2: ConnectTimeoutError(<urllib3.connectionpool.HTTPConnectionPool object at 0x7f1f08209d10>, u'Connection to 192.168.0.3 timed out. (connect timeout=60)')
Traceback (most recent call last):
File "tester.py", line 183, in
File "tester.py", line 179, in main
main()
File "tester.py", line 145, in main
self.get_action(action_number)()
File "tester.py", line 151, in watch_database
recursive=True)
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 536, in read
timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 846, in wrapper
self._machines_cache = self.machines
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 299, in machines
return self.machines
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 299, in machines
return self.machines
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 299, in machines
return self.machines
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 299, in machines
return self.machines
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 299, in machines
return self.machines
File "/usr/local/lib/python2.7/dist-packages/etcd/client.py", line 301, in machines
raise etcd.EtcdException("Could not get the list of servers, "
etcd.EtcdException: Could not get the list of servers, maybe you provided the wrong host(s) to connect to?
So my question is: why do I lose my connection? I gave the client the public IP, asked it to wait, and did not set any timeout, so that it would use "None" and wait indefinitely. Does anyone have an idea? :/
Regards.
This honestly looks like you have a server-side timeout, and then your servers advertise their listen urls to clients with the wrong IPs.
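One way to check the second point, as a rough sketch: compare the member URLs the cluster hands out with the host you actually configured. This assumes you have fetched the comma-separated list that the v2 API returns from /v2/machines yourself (for example with curl); the function name `foreign_hosts` and its arguments are made up for illustration and are not part of python-etcd.

```python
from urllib.parse import urlparse


def foreign_hosts(machines_text, configured_host):
    """Return advertised hosts that differ from the one the client was given.

    machines_text: comma-separated URL list, as assumed to be returned by
                   the etcd v2 /v2/machines endpoint.
    configured_host: the host the client was originally pointed at.
    """
    urls = [u.strip() for u in machines_text.split(",") if u.strip()]
    hosts = {urlparse(u).hostname for u in urls}
    # Anything left over is an address the client may try on reconnect.
    return sorted(hosts - {configured_host})
```

If this returns the 192.168.0.x addresses, the client is being told to reconnect to hosts it cannot reach from outside the private network, which would match the ConnectTimeoutError lines in your log.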
Are any of the servers running with --advertise-client-urls http://10.50.0.216:2379? That doesn't seem to be the case, judging from what etcd is doing.
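Independently of the server configuration, a long-lived watch can be made more tolerant by treating a timeout as expected and re-issuing the read. Below is a minimal sketch, not a definitive implementation: `watch_with_retry` and its parameters are hypothetical, `read` is expected to be something like `client.read`, and `retry_on` should be set to whatever timeout exceptions your client version actually raises. It assumes each result exposes a `modifiedIndex` attribute and that the read call accepts a `waitIndex` keyword, so that changes made while reconnecting are less likely to be missed.

```python
import logging
import time


def watch_with_retry(read, prefix, handle, retry_on=(Exception,),
                     backoff=1.0, max_events=None):
    """Watch `prefix` in a loop, re-issuing the read after a timeout.

    read:       callable such as client.read (assumption, duck-typed).
    handle:     callback invoked with each watch result.
    retry_on:   exception types treated as recoverable.
    max_events: stop after this many events (None means run forever).
    """
    next_index = None
    seen = 0
    while max_events is None or seen < max_events:
        kwargs = {"wait": True, "recursive": True}
        if next_index is not None:
            # Resume from the event after the last one we handled.
            kwargs["waitIndex"] = next_index
        try:
            result = read(prefix, **kwargs)
        except retry_on as exc:
            logging.warning("watch on %s interrupted (%s); retrying",
                            prefix, exc)
            time.sleep(backoff)
            continue
        handle(result)
        next_index = result.modifiedIndex + 1
        seen += 1
    return seen
```

In the original method this would be called roughly as `watch_with_retry(self.client.read, self.prefix, logging.warning, retry_on=(...))`, with the tuple filled in by the exceptions seen in the traceback. Note that this only hides the symptom: if the client still cannot reach any advertised member, the retry will keep failing.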