During heavy load, ES times out and processing halts
Closed this issue · 2 comments
punisherVX commented
When ES is under heavy load or pauses for garbage collection, the default timeout (10s) is not enough. When a request times out, all SFN processing halts and never starts again. See the stack trace:
GET http://localhost:9200/threat-*/_search [status:N/A request:10.018s]
Traceback (most recent call last):
File "/home/ubuntu/safe-networking/sfn-env/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/home/ubuntu/safe-networking/sfn-env/lib/python3.6/site-packages/urllib3/connectionpool.py", line 383, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
response.begin()
File "/usr/lib/python3.6/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.6/http/client.py", line 258, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
punisherVX commented
There are a few ways to try to fix this: set the timeout to more than 10s (as documented here), or figure out why GC is taking so long and fix that.
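Besides raising the timeout, a caller could also avoid halting on a single timed-out request by retrying it. This is only a rough sketch of that idea, not what SFN actually does: `call_with_retry` and `flaky_search` are hypothetical names, and the retry count and delays are arbitrary. It catches `socket.timeout` because that is what the trace shows at the bottom; when going through the ES Python client you would probably pass the client's own timeout exception in `retry_on` instead.

```python
import socket
import time


def call_with_retry(request_fn, retries=3, base_delay=1.0,
                    retry_on=(socket.timeout,), sleep=time.sleep):
    """Call request_fn(), retrying with exponential backoff on a timeout.

    Hypothetical helper: request_fn is any zero-argument callable that
    performs the search. The last timeout is re-raised once retries run out.
    """
    for attempt in range(retries + 1):
        try:
            return request_fn()
        except retry_on:
            if attempt == retries:
                raise  # give up and let the caller decide what to do
            # Wait 1s, 2s, 4s, ... so a GC pause has a chance to finish.
            sleep(base_delay * 2 ** attempt)


# Stand-in for a search call that times out twice and then succeeds.
calls = []


def flaky_search():
    calls.append(1)
    if len(calls) < 3:
        raise socket.timeout("timed out")
    return {"hits": {"total": 0}}


# Record the delays instead of really sleeping, to keep the demo fast.
delays = []
result = call_with_retry(flaky_search, sleep=delays.append)
```

Retrying only hides the symptom. If GC pauses regularly exceed the timeout, the timeout or the ES heap settings still need attention.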
punisherVX commented
This is fixed in fbc36f0