mcuadros/ofelia

Cron jobs are not sent SIGTERM on termination, and new jobs are still started while waiting for termination

Tofandel opened this issue · 0 comments

When stopping an ofelia container, the running jobs do not receive a SIGTERM signal, which would tell them the container is about to stop. Instead, ofelia just waits for them to finish normally (which can take an undefined amount of time). Then, 10 seconds later, docker SIGKILLs the ofelia container, which prevents the jobs from shutting down correctly and doing the cleanup that they should be doing. This is a problem for very long jobs.

A SIGTERM would normally tell those jobs that they are about to be stopped and that they should save their current state. Otherwise they lose the progress they have made.

Also, while ofelia is waiting for running jobs to stop, new jobs are still being picked up, and these would also get killed without due process:

2023-07-05 01:58:08 2023-07-04T23:58:08.141Z  daemon.go:69 ▶ WARNING Signal received: terminated, shutting down the process
2023-07-05 01:58:08 2023-07-04T23:58:08.141Z  daemon.go:83 ▶ WARNING Waiting running jobs.
2023-07-05 01:58:16 2023-07-04T23:58:16.004Z  common.go:125 ▶ NOTICE [Job "jobqueue" (3982f5ca83d2)] Started - /var/www/html/bin/console jms-job-queue:run -vv -r 300
2023-07-05 01:58:16 2023-07-04T23:58:16.016Z  common.go:125 ▶ NOTICE [Job "mailqueue" (c7855fbff257)] Started - /var/www/html/bin/console swiftmailer:spool:send --mailer=queue --message-limit=30
2023-07-05 01:58:16 2023-07-04T23:58:16.441Z  common.go:125 ▶ NOTICE [Job "mailqueue" (c7855fbff257)] StdOut: 
2023-07-05 01:58:16  [2023-07-04 23:58:16] Processing queue mailer spool... 
2023-07-05 01:58:16  0 emails sent
2023-07-05 01:58:16 2023-07-04T23:58:16.441Z  common.go:125 ▶ NOTICE [Job "mailqueue" (c7855fbff257)] Finished in "425.510373ms", failed: false, skipped: false, error: none

PS: While the docker API currently does not support killing an exec, we can still create a new exec with the command kill -n 9 $execId. I would attempt a PR, but I have never touched Go before.
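
To make the requested behaviour concrete, here is a minimal sketch in Go of what I would expect on shutdown: refuse to start new jobs, forward SIGTERM to the ones still running, then wait for a bounded grace period. This is not ofelia's code. The Runner type and its Start and Shutdown methods are hypothetical names, it only covers processes spawned locally with os/exec on a Unix-like system (not docker execs), and the grace period is an assumed parameter.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
	"syscall"
	"time"
)

// Runner is a hypothetical stand-in for a scheduler; it is not an ofelia type.
type Runner struct {
	mu       sync.Mutex
	stopping bool
	running  map[*exec.Cmd]struct{}
	wg       sync.WaitGroup
}

func NewRunner() *Runner {
	return &Runner{running: make(map[*exec.Cmd]struct{})}
}

// Start launches a command unless shutdown has already begun.
// It reports whether the job was actually started.
func (r *Runner) Start(name string, args ...string) (bool, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.stopping {
		return false, nil // no new jobs while terminating
	}
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return false, err
	}
	r.running[cmd] = struct{}{}
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		_ = cmd.Wait() // reap the process; exit status is ignored in this sketch
		r.mu.Lock()
		delete(r.running, cmd)
		r.mu.Unlock()
	}()
	return true, nil
}

// Shutdown stops accepting jobs, sends SIGTERM to every running job and
// waits up to grace for them to exit. It returns true if all jobs exited.
func (r *Runner) Shutdown(grace time.Duration) bool {
	r.mu.Lock()
	r.stopping = true
	for cmd := range r.running {
		// An error here usually means the process already finished.
		_ = cmd.Process.Signal(syscall.SIGTERM)
	}
	r.mu.Unlock()

	done := make(chan struct{})
	go func() {
		r.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(grace):
		return false
	}
}

func main() {
	r := NewRunner()
	started, err := r.Start("sleep", "30")
	fmt.Println("started:", started, "err:", err)
	fmt.Println("all jobs exited within grace period:", r.Shutdown(5*time.Second))
	late, _ := r.Start("sleep", "1")
	fmt.Println("started after shutdown:", late)
}
```

The grace period would have to stay below docker's stop timeout, otherwise the container is SIGKILLed before the jobs get a chance to finish their cleanup.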