NagiosEnterprises/ncpa

Large number of zombie processes not reaped by NCPA

Opened this issue · 4 comments

I had an instance of NCPA get into a state where it had a large number of zombie processes that were not being reaped.

image

I've attached some logs, which seem to suggest that some Python threads are not exiting properly.
ncpa-logs.txt

NCPA version is 3.1.1
OS is Ubuntu 22.04.5 LTS.
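
For anyone trying to confirm the same symptom, here is a minimal sketch (Linux only) that counts zombie processes grouped by parent PID by reading `/proc/<pid>/stat`. The names `parse_stat` and `zombies_by_parent` are made up for this illustration and are not part of NCPA. If most of the zombies have the NCPA process as their parent, that would be consistent with the state described above.

```python
import os
from collections import Counter


def parse_stat(stat_text):
    """Return (state, ppid) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    after_comm = stat_text.rsplit(")", 1)[1].split()
    return after_comm[0], int(after_comm[1])


def zombies_by_parent(proc_root="/proc"):
    """Count processes in state 'Z' (zombie), grouped by parent PID."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                state, ppid = parse_stat(fh.read())
        except (OSError, IndexError, ValueError):
            continue  # the process went away while we were scanning
        if state == "Z":
            counts[ppid] += 1
    return counts
```

Iterating over `zombies_by_parent().most_common()` lists the parents holding the most zombies first.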

Thank you for reporting this. I'll begin investigating as soon as I can.

We have also had two Ubuntu hosts hit the same problem. Both reached 200+ zombie processes; stopping the service didn't clear them, and we had to reboot the VMs. This only started happening recently.

We've fixed this issue by downgrading to 3.1.0, so it looks like this was introduced in 3.1.1.

Do you have any checks that run scripts? We removed a check that ran a script to check Postfix, and so far it hasn't occurred since. If one of your checks runs a script, that might narrow down where the issue was introduced in 3.1.1.
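
As general background on why script checks can be involved: a child process stays a zombie from the moment it exits until its parent waits on it. The sketch below shows one common way a parent can run a script with a timeout and still always reap the child. It is not NCPA's code and says nothing about what changed in 3.1.1; `run_check` is a hypothetical helper, and the 5-second default timeout is an arbitrary assumption.

```python
import subprocess
import sys


def run_check(cmd, timeout=5.0):
    """Run a check command and always reap it, even on timeout.

    Hypothetical helper for illustration only; not taken from NCPA.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        # communicate() waits for the child to exit, which reaps it.
        output, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        # Without this second communicate(), the killed child would stay
        # a zombie until something else waits on it.
        output, _ = proc.communicate()
    return proc.returncode, output


if __name__ == "__main__":
    code, out = run_check([sys.executable, "-c", "print('OK')"])
    print(code, out.strip())
```

The point of the pattern is that every code path ends in a wait on the child. A path that kills or abandons the child without waiting leaves a zombie behind.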