Defunct/zombie processes accumulate under go-crond
eguaj opened this issue · 3 comments
Hi,
I made a container version of an app (which I have minimal control over), and I implemented the cron part using go-crond, but at runtime I'm observing an accumulation of processes in the "defunct" state:
dev 2530591 0.0 0.3 722320 12440 ? Sl mars12 3:26 \_ /home/dev/bin/containerd-shim-runc-v2 -namespace moby -id 5571b9c2af06289520b43579d570dd83a766304c27c9c3019e07f8e2f16ba19f -address /run/user/1001/docker/containerd/containerd.sock
dev 2530636 0.0 0.2 1234760 8688 ? Ssl mars12 0:40 | \_ /usr/local/bin/go-crond -v www-data:/var/spool/cron/crontabs/www-data
165568 2750043 0.0 0.0 0 0 ? Z mars12 0:00 | \_ [exec65f0de702] <defunct>
165568 2823765 0.0 0.0 0 0 ? Z mars13 0:00 | \_ [exec65f12a9c1] <defunct>
165568 2824042 0.0 0.0 0 0 ? Z mars13 0:00 | \_ [exec65f12ad82] <defunct>
[...]
165568 2897261 0.0 0.0 0 0 ? Z 04:24 0:00 | \_ [exec66024e1c2] <defunct>
165568 2897559 0.0 0.0 0 0 ? Z 04:25 0:00 | \_ [exec66024e581] <defunct>
165568 2897853 0.0 0.0 0 0 ? Z 04:26 0:00 | \_ [exec66024e942] <defunct>
I do not observe this behavior when the application is running on a "real" server (e.g. using anacron).
The problem seems to be that the process launched by the crontab (which I have no control over) uses a shell & construct to "fork" itself into the background, and somehow this leaves the process in a zombie state when running under go-crond.
Here is a stripped-down test case that reproduces the problem.
- Create a crontab that performs a shell background execution:
cat <<'EOF' > crontab
*/1 * * * * sh -c 'sleep 2 &'
EOF
chmod 0600 crontab
- Run the crontab in a container using go-crond:
docker run --rm -it --volume ./crontab:/tmp/crontab webdevops/go-crond:debian -v root:/tmp/crontab
- While the go-crond container is running, observe that "defunct" processes are created and accumulate over time:
watch "ps auxwww | grep -P '\\bdefunct'"
We see that defunct/zombie processes accumulate over time:
root 71596 0.0 0.0 0 0 pts/0 Z+ 11:59 0:00 [sleep] <defunct>
root 71766 0.0 0.0 0 0 pts/0 Z+ 12:00 0:00 [sleep] <defunct>
[...]
go-crond won't reap zombies. You'll need to use something like tini as PID 1 in the container, and run go-crond under that.
Thanks, that was also my conclusion: go-crond does not automatically collect and reap dead children.
I did not know about tini, but I'll try running go-crond with it.
I know nothing about Go programming, but I experimented with using https://github.com/hashicorp/go-reap in runner.go, and it seems to prevent defunct/zombie processes from appearing (but I don't know whether it could have drawbacks or unwanted side effects in the long run):
diff --git a/runner.go b/runner.go
index dec2af9..dbf1b71 100644
--- a/runner.go
+++ b/runner.go
@@ -11,6 +11,7 @@ import (
 	"github.com/prometheus/client_golang/prometheus"
 	cron "github.com/robfig/cron/v3"
 	log "github.com/sirupsen/logrus"
+	reap "github.com/hashicorp/go-reap"
 )

 type Runner struct {
@@ -107,6 +108,7 @@ func (r *Runner) Len() int {
 // Start runner
 func (r *Runner) Start() {
 	log.Infof("start runner with %d jobs\n", r.Len())
+	go reap.ReapChildren(nil, nil, nil, nil)
 	r.cron.Start()
 	r.initAllCronEntryMetrics()
 }
As mentioned on the tini project page, it is now natively integrated in Docker and docker-compose through the init property: https://docs.docker.com/compose/compose-file/compose-file-v2/#init
I've modified my Docker Compose file to add an init: true directive, and it solves my problem of defunct processes accumulating over time.
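For reference, a minimal sketch of what such a Compose file could look like for the test case above. The service name cron is hypothetical, and the image, arguments and volume are simply carried over from the docker run command in the reproduction steps, not taken from my real setup:

```yaml
services:
  cron:                        # hypothetical service name
    image: webdevops/go-crond:debian
    init: true                 # run an init process as PID 1 to reap zombies
    command: ["-v", "root:/tmp/crontab"]
    volumes:
      - ./crontab:/tmp/crontab
```

With plain docker run, the equivalent should be the --init flag.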