Tendrl/tendrl-ansible

carbon: exceptions.IOError: [Errno 24] Too many open files

nthomas-redhat opened this issue · 3 comments

On bigger clusters, carbon-cache repeatedly fails with a long
traceback in /var/log/carbon/console.log, with the following error:
exceptions.IOError: [Errno 24] Too many open files: ...

This caused subsequent issues in collectd on Gluster storage servers:
collectd: error: [Errno 110] Connection timed out
(from /var/log/messages log)

The number of files carbon-cache.py can open needs to be increased. Many systems default to a maximum of 1024 open file descriptors per process. I propose increasing this to 16384, which might be good enough for now, given the cluster sizes we support at the moment.

For more information:
http://graphite.readthedocs.io/en/latest/carbon-daemons.html
https://kadirsert.blogspot.in/2015/05/graphite-carbon-cache-ioerror-with-too.html
https://bugzilla.redhat.com/show_bug.cgi?id=1560875

Reopening, as the previous fix does not address the problem.

Let's explore the option of setting this in a systemd service file stored under /etc, which would override the unit file that comes from the rpm package.
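
As a rough sketch of what such an override could look like: systemd reads drop-in files from /etc/systemd/system/<unit>.d/, so the limit can be raised without touching the packaged unit file. The unit name carbon-cache.service and the file name limits.conf below are assumptions, not something confirmed in this issue; the value 16384 is the one proposed above.

```ini
# /etc/systemd/system/carbon-cache.service.d/limits.conf
# Hypothetical drop-in: the unit name (carbon-cache.service) and this
# file name are assumptions; adjust them to match the installed unit.
[Service]
# Raise the open file descriptor limit for the service
# (value proposed in this issue).
LimitNOFILE=16384
```

After creating the file, `systemctl daemon-reload` followed by a restart of the service should apply the new limit, and `systemctl show carbon-cache -p LimitNOFILE` can be used to check the effective value. A drop-in only adds or overrides the listed settings, so the rest of the unit file from the rpm package stays in effect and keeps receiving package updates.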

This is expected to be fixed via: