alexreinert/homematic_check_mk

server.tcl creates tons of zombie processes

Closed this issue · 3 comments

After installing your Check_MK add-on on RaspberryMatic and running it for a while, Check_MK notified me that a large number of tclsh zombie processes had accumulated:

# ps -o stat,pid,ppid,user,group,comm,args | grep ^Z | head -10
Z      302 22525 root     root     tclsh            [tclsh]
Z      303 22525 root     root     tclsh            [tclsh]
Z      312 22525 root     root     tclsh            [tclsh]
Z      350 22525 root     root     tclsh            [tclsh]
Z      362 22525 root     root     tclsh            [tclsh]
Z      386 22525 root     root     tclsh            [tclsh]
Z      389 22525 root     root     tclsh            [tclsh]
Z      391 22525 root     root     tclsh            [tclsh]
Z      429 22525 root     root     tclsh            [tclsh]
Z      451 22525 root     root     tclsh            [tclsh]
# ps -o stat,pid,ppid,user,group,comm,args | grep ^Z | wc -l
1827
# ps aux | grep 22525
22525 root      0:15 tclsh /usr/local/addons/check_mk_agent/server.tcl

As you can see, the PPID of these tclsh zombies points to server.tcl, and around 1800 of them have already accumulated here. Running /etc/config/rc.d/check_mk_agent restart clears these zombies immediately.
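For anyone who wants to check their own system, here is a rough sketch that groups zombies by parent PID instead of listing them one by one. The function name count_zombies_by_ppid is made up for illustration, and it assumes the same column layout as the ps call above (STAT in column 1, PPID in column 3); other ps variants may need different column numbers.

```shell
# Count zombie processes per parent PID.
# Reads ps-style output on stdin; assumes STAT is column 1 and PPID is column 3,
# as produced by: ps -o stat,pid,ppid,user,group,comm,args
count_zombies_by_ppid() {
    awk '$1 ~ /^Z/ { n[$3]++ } END { for (p in n) print p, n[p] }'
}

# Usage on a live system (assumed column layout, see above):
#   ps -o stat,pid,ppid,user,group,comm,args | count_zombies_by_ppid
```

Each output line is a parent PID followed by the number of zombies it has not yet reaped, which makes it easy to see whether a single process such as server.tcl is responsible.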

I haven't debugged this further yet, but since restarting check_mk_agent once I haven't seen these zombies again. I will keep monitoring this for a while to see whether they reappear. I just wanted to let you know that there might still be an issue hidden somewhere in how server.tcl cleans up its child processes.

Could you provide more information? Which RaspberryMatic version are you running, was it your dev or your prod system, was the rega running all the time, ...

server.tcl does not create child tclsh processes by itself. It only creates child processes for a few Linux commands. The TCP server itself could fork, but since the complete connection handler is wrapped in a catch block, there should be no way for it to leave a zombie behind (according to the Tcl docs). The only possible sources of zombies I can see are the xmlrpc or rega_script blocks, but I don't know whether they use fork internally.

Please have a look at the following forum thread, where another user reports similar behavior:
https://homematic-forum.de/forum/viewtopic.php?f=65&t=46221&p=462567#p462557

Quite some time has passed now, and I was never able to reproduce this issue again. Thus, I think it is safe to close this ticket as invalid.