jplana/python-etcd

Stale connection


I'm having an issue with a client connection when using wait=True. I make the following call:

output = client.read("foo/{0}".format(bar), wait=True, timeout=0, recursive=True)

Machines check in to foo/bar/fqdn every 30 minutes or so to refresh a TTL, which does not trigger the watcher. Every so often a new machine connects or a TTL expires, forcing a reload. Everything works fine in my short-duration tests, but over the longer term the connection seems to go stale: it remains active but never receives an update from etcd. The result is a stalled script. etcd may send updates for that directory, but my Python script never receives them, which is a fatal problem for scripts that watch a directory for live configuration data.

I'm trying to reproduce the problem now, but have not been successful so far. I've run one test for 12 hours and another for 2 days, and neither resulted in a stale connection.

I have two questions. First, are you aware of any issues that might cause this behavior? Second, should I set a timeout on waits to avoid stale connections, or should I be able to expect a watcher to watch indefinitely?
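
For reference, the timeout approach from the second question could look roughly like the sketch below: each wait is bounded, and the watch is re-issued whenever it times out, so a silently dead connection is replaced after at most one timeout period. This is only a sketch under assumptions, not a confirmed fix. The helper name `watch_with_timeout` and its parameters are hypothetical, and it assumes the `read` callable accepts `wait`, `recursive`, `timeout`, and `waitIndex` keyword arguments and returns an object with a `modifiedIndex` attribute.

```python
def watch_with_timeout(read, key, timeout_errors, wait_timeout=300,
                       on_change=print, max_rounds=None):
    """Re-issue a bounded blocking watch whenever it times out.

    Hypothetical helper. `read` is assumed to behave like client.read,
    and `timeout_errors` is whichever exception class (or tuple of
    classes) the client raises when a wait times out.
    """
    wait_index = None  # no known index before the first event
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        kwargs = {"wait": True, "recursive": True, "timeout": wait_timeout}
        if wait_index is not None:
            # Resume just after the last event seen, so changes made
            # between two waits are not missed.
            kwargs["waitIndex"] = wait_index
        try:
            result = read(key, **kwargs)
        except timeout_errors:
            # Nothing arrived within wait_timeout: start a fresh request
            # instead of trusting a connection that may be stale.
            continue
        wait_index = result.modifiedIndex + 1
        on_change(result)
```

With python-etcd this might be called as `watch_with_timeout(client.read, "foo/{0}".format(bar), etcd.EtcdWatchTimedOut)`, assuming the installed version raises `etcd.EtcdWatchTimedOut` when a wait times out. `max_rounds` is there only so the loop can be bounded in tests.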

Thanks,