idena-network/idena-go

Process is leaking file handles on Linux


I have now twice received the "Too many open files" error from idena-go. Both times I just increased the limit, but even 65535 was not high enough (I now have the limit at 1048576).

No process should be using that many file handles, so I investigated the open files:

lsof -n | awk '{ print $2; }' | sort -rn | uniq -c | sort -rn | head -1

This returned the following counts over roughly the first 30 minutes of the idena-go run:

   5707 11409
   8008 11409
  29782 11409
  23453 11409
  18220 11409

11409 is the process id of the idena-go process.

Examining the lsof -n output more closely, I found it contains thousands of open IPv4 sockets. For example:

exec-iden 11409                  mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11410            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11411            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11412            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11413            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11414            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11415            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11416            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11417            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11419            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11434            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11445            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)
exec-iden 11409 11469            mikko  436u     IPv4             869981       0t0        TCP xxx.xxx.xxx.xxx:40405->130.61.245.64:40407 (ESTABLISHED)

Here I filtered for a single target IP address, but there are hundreds or thousands of IP addresses like this. I cannot find the listed process IDs (11410 and above) on my system.

At the same time, the idena-go log doesn't report an excessive number of connections; it shows total-peers=12 own-shard-peers=9.

My idena-go version is 0.27.2. I am running a Debian 9.13 (stretch) system on a VPS with 3 vCPUs and 4 GB RAM. Kernel version is 4.9.0-13-amd64 #1 SMP Debian 4.9.228-1 (2020-07-05) x86_64 GNU/Linux.

Ever since posting the above, I kept getting "Too many open files" errors from the Idena process, about once every week, with higher probability near validations.

It turned out that the limit I had set in /etc/security/limits.conf had NOT taken effect (limits.conf is applied by PAM to login sessions, which a systemd service does not go through). According to /proc/****/limits, the Idena process limit was still 4096. I am running the Idena process under systemd, and setting the limits in the .service file fixed it:

LimitNOFILE=65536
LimitNOFILESoft=65536
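
To confirm that a change like this actually reached the running process, the "Max open files" row of its limits file can be read directly. The following is only a minimal sketch: the function name max_open_files is made up for this example, and it assumes the usual Linux layout of that file (soft limit in the fourth column, hard limit in the fifth).

```shell
# Print the soft and hard "Max open files" limits found in a limits file.
# Hypothetical helper; expects the path of a /proc/<pid>/limits style file.
max_open_files() {
    # Row layout: "Max open files  <soft>  <hard>  files"
    awk '/^Max open files/ { print $4, $5 }' "$1"
}
```

Called with the limits file of the node process, it prints two numbers (soft, then hard). If they still show the old value after restarting the service, the new setting has not been picked up.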

Two things learned from all this:

  1. Check /proc/****/limits for the actual in-effect limits.

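  2. The entries lsof shows are NOT all "real" open files that count toward this limit. In the sample above, every line has the same descriptor (436u) and the same socket (869981), so lsof appears to list each descriptor once per thread. After all, lsof consistently showed tens of thousands of entries for my Idena process, yet the process stayed under the in-effect limit of 4096 files most of the time.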
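
For the second point, a count that should track the limit more closely than lsof output is the number of entries in the process's fd directory under /proc, since the kernel exposes one entry per open descriptor there. A minimal sketch, assuming Linux and read access to that directory (same user or root); the function name count_fds and its optional second argument are inventions of this example:

```shell
# Count the open descriptors of a process by listing <root>/<pid>/fd.
# Hypothetical helper: $1 is the PID, $2 is an optional root that defaults to /proc.
count_fds() {
    # One directory entry per descriptor; tr strips padding some wc versions add.
    ls -1A "${2:-/proc}/$1/fd" | wc -l | tr -d ' '
}
```

For the process in this report that would be `count_fds 11409`. Comparing the result with the "Max open files" row in /proc/****/limits shows how close the process really is to the limit.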