webdevops/go-crond

Defunct/zombie processes accumulate under go-crond

eguaj opened this issue · 3 comments

eguaj commented

Hi,

I built a containerized version of an app (over which I have minimal control) and implemented the cron part using go-crond, but at runtime I observe an accumulation of processes in the "defunct" state:

dev      2530591  0.0  0.3 722320 12440 ?        Sl   mars12   3:26  \_ /home/dev/bin/containerd-shim-runc-v2 -namespace moby -id 5571b9c2af06289520b43579d570dd83a766304c27c9c3019e07f8e2f16ba19f -address /run/user/1001/docker/containerd/containerd.sock                                                                  
dev      2530636  0.0  0.2 1234760 8688 ?        Ssl  mars12   0:40  |   \_ /usr/local/bin/go-crond -v www-data:/var/spool/cron/crontabs/www-data                                                                                                                                                                             
165568   2750043  0.0  0.0      0     0 ?        Z    mars12   0:00  |       \_ [exec65f0de702] <defunct>
165568   2823765  0.0  0.0      0     0 ?        Z    mars13   0:00  |       \_ [exec65f12a9c1] <defunct>
165568   2824042  0.0  0.0      0     0 ?        Z    mars13   0:00  |       \_ [exec65f12ad82] <defunct>
[...]
165568   2897261  0.0  0.0      0     0 ?        Z    04:24   0:00  |       \_ [exec66024e1c2] <defunct> 
165568   2897559  0.0  0.0      0     0 ?        Z    04:25   0:00  |       \_ [exec66024e581] <defunct> 
165568   2897853  0.0  0.0      0     0 ?        Z    04:26   0:00  |       \_ [exec66024e942] <defunct>

I do not observe this behavior when the application is running on a "real" server (e.g. using anacron).

The problem seems to stem from the fact that the process launched by the crontab (which I have no control over) uses a shell & construct to "fork" itself into the background, and somehow this leaves the process in a zombie state when running under go-crond.

Here is a stripped-down test case that reproduces the problem.

  • Create a crontab that performs a shell background execution:
cat <<'EOF' > crontab
*/1 * * * * sh -c 'sleep 2 &'
EOF

chmod 0600 crontab
  • Run the crontab in a container using go-crond:
docker run --rm -it --volume ./crontab:/tmp/crontab webdevops/go-crond:debian -v root:/tmp/crontab
  • While the go-crond container is running, observe that "defunct" processes are created and accumulate over time:
watch "ps auxwww | grep -P '\\bdefunct'"

The output shows defunct/zombie processes accumulating over time:

root       71596  0.0  0.0      0     0 pts/0    Z+   11:59   0:00 [sleep] <defunct>
root       71766  0.0  0.0      0     0 pts/0    Z+   12:00   0:00 [sleep] <defunct>
[...]

smlx commented

go-crond won't reap zombies. You'll need to use something like tini as PID 1 in the container, and run go-crond under that.
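
To illustrate why this happens, here is a minimal Go sketch (Linux only, not taken from go-crond). It assumes that being PID 1 in a container can be approximated with the PR_SET_CHILD_SUBREAPER prctl option, and the helper names becomeSubreaper, procState, and orphanState are hypothetical. The shell backgrounds a sleep and exits at once, the orphaned sleep is re-parented to the Go process, and because nothing ever waits on it, it should remain in state Z (zombie) after it finishes.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"strings"
	"syscall"
	"time"
)

// prSetChildSubreaper is the Linux prctl(2) option PR_SET_CHILD_SUBREAPER.
const prSetChildSubreaper = 36

// becomeSubreaper makes orphaned descendants re-parent to this process,
// which approximates the role of PID 1 inside a container.
func becomeSubreaper() error {
	_, _, errno := syscall.RawSyscall(syscall.SYS_PRCTL, prSetChildSubreaper, 1, 0)
	if errno != 0 {
		return errno
	}
	return nil
}

// procState returns the one-letter process state from /proc/<pid>/stat.
func procState(pid int) (string, error) {
	data, err := os.ReadFile("/proc/" + strconv.Itoa(pid) + "/stat")
	if err != nil {
		return "", err
	}
	// The state field follows the closing parenthesis of the command name.
	s := string(data)
	fields := strings.Fields(s[strings.LastIndex(s, ")")+1:])
	if len(fields) == 0 {
		return "", fmt.Errorf("unexpected stat format for pid %d", pid)
	}
	return fields[0], nil
}

// orphanState runs a shell that backgrounds a sleep and exits immediately,
// then polls the state of the orphaned sleep without ever waiting on it.
func orphanState() (string, error) {
	if err := becomeSubreaper(); err != nil {
		return "", err
	}
	// The shell prints the PID of the background job; exec waits for the shell only.
	out, err := exec.Command("sh", "-c", "sleep 1 >/dev/null 2>&1 & echo $!").Output()
	if err != nil {
		return "", err
	}
	pid, err := strconv.Atoi(strings.TrimSpace(string(out)))
	if err != nil {
		return "", err
	}
	state := ""
	for i := 0; i < 50; i++ {
		time.Sleep(100 * time.Millisecond)
		if state, err = procState(pid); err != nil || state == "Z" {
			break
		}
	}
	return state, err
}

func main() {
	state, err := orphanState()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("state of orphaned sleep:", state)
}
```

An init such as tini avoids this because it calls wait on every child it inherits, which is the part missing here.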

eguaj commented

Thanks, that was also my conclusion: go-crond does not automatically collect and reap dead children.

I did not know about tini, but I'll try running go-crond with it.

I know nothing about Go programming, but I experimented with using https://github.com/hashicorp/go-reap in runner.go, and it seems to prevent defunct/zombie processes from appearing (though I don't know whether it could have drawbacks or unwanted side effects in the long run):

diff --git a/runner.go b/runner.go
index dec2af9..dbf1b71 100644
--- a/runner.go
+++ b/runner.go
@@ -11,6 +11,7 @@ import (
        "github.com/prometheus/client_golang/prometheus"
        cron "github.com/robfig/cron/v3"
        log "github.com/sirupsen/logrus"
+       reap "github.com/hashicorp/go-reap"
 )
 
 type Runner struct {
@@ -107,6 +108,7 @@ func (r *Runner) Len() int {
 // Start runner
 func (r *Runner) Start() {
        log.Infof("start runner with %d jobs\n", r.Len())
+       go reap.ReapChildren(nil, nil, nil, nil)
        r.cron.Start()
        r.initAllCronEntryMetrics()
 }

eguaj commented

As mentioned on the tini project page, tini is now natively integrated in Docker and docker-compose through the init property: https://docs.docker.com/compose/compose-file/compose-file-v2/#init

I've modified my Docker Compose file to add an init: true directive, and it solves my problem of defunct processes accumulating over time.