liske/needrestart

needrestart very slow when munin services are running

saffroy opened this issue · 1 comments

Hello,

needrestart is usually rather slow on my machine (about 10 s per run), and it drops to under 2 s when I stop all munin services, including apache, which spawns CGIs running munin code. Munin is a monitoring system written in Perl.

# time needrestart -v -r l -l &> /dev/null 

real    0m10.530s
user    0m9.823s
sys     0m0.678s

# systemctl stop munin munin-node.service apache2.service 

# time needrestart -v -r l -l &> /dev/null 

real    0m1.783s
user    0m1.490s
sys     0m0.290s

The following lines from the verbose output suggest that the time is spent inspecting munin's Perl processes:

# time needrestart -v -r l -l 
...
[Core] #962092 is a NeedRestart::Interp::Perl
[Perl] #962092: source=/usr/sbin/munin-node
[Core] #964197 is a NeedRestart::Interp::Perl
[Perl] #964197: source=/usr/lib/munin/cgi/munin-cgi-graph
...

This is on Debian 12 (bookworm) with packaged versions of both needrestart and munin.

Is this expected behaviour?

Hmmm, digging with a profiler (NYTProf) led me to Module::ScanDeps, which needrestart calls through scan_deps, so I suppose the slowness is to be expected.

Profiler annotation:

$href = scan_deps(
        # spent  26.0s making 2 calls to [Module::ScanDeps::scan_deps], avg 13.0s/call

Running the scandeps command-line tool on one of the munin scripts shows a similar cost:

$ time scandeps /usr/lib/munin/cgi/munin-cgi-graph |wc -l
# Use of runtime loader module Module::Runtime detected.  Results of static scanning may be incomplete.
# Use of runtime loader module Module::Implementation detected.  Results of static scanning may be incomplete.
1217

real    0m5.596s
user    0m5.375s
sys     0m0.217s

The scandeps command can use a cache file though, and it seems to help quite a bit:

$ rm cache 

# empty cache
$ time scandeps -C cache /usr/lib/munin/cgi/munin-cgi-graph &> /dev/null 

real    0m5.757s
user    0m5.521s
sys     0m0.216s

# warm cache
$ time scandeps -C cache /usr/lib/munin/cgi/munin-cgi-graph &> /dev/null 

real    0m1.024s
user    0m0.786s
sys     0m0.236s

Maybe it would make sense for needrestart to keep such a cache too, since it will often rescan the same scripts? Apparently the cache file can be passed as a parameter to the scan_deps function.
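
To illustrate the idea from the command line, here is a minimal sketch of a wrapper that always passes the same cache file to scandeps, so repeated scans start warm. The names `SCANDEPS_BIN`, `CACHE_DIR` and `scan_cached` are made up for this example, the cache location is just a guess at a sensible default, and it assumes that one cache file can be shared across several scanned scripts. Only the `-C` option comes from the timings above.

```shell
#!/bin/sh
# Sketch only: SCANDEPS_BIN, CACHE_DIR and scan_cached are hypothetical names,
# not part of needrestart or Module::ScanDeps.
SCANDEPS_BIN=${SCANDEPS_BIN:-scandeps}
CACHE_DIR=${CACHE_DIR:-${XDG_CACHE_HOME:-$HOME/.cache}/scandeps}

# Scan one script, reusing a cache file that persists between runs.
scan_cached() {
    # Create the cache directory on first use.
    mkdir -p "$CACHE_DIR" || return 1
    # -C is the same cache option used in the timings above.
    "$SCANDEPS_BIN" -C "$CACHE_DIR/cache" "$1"
}
```

For example, `scan_cached /usr/lib/munin/cgi/munin-cgi-graph | wc -l` should pay the full scanning cost only on the first run. needrestart would need the equivalent inside its Perl code rather than a shell wrapper.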