astronomer/ap-airflow

clean_airflow_logs.sh does not handle SIGTERM properly

seifertm opened this issue · 1 comment

The script clean-airflow-logs.sh performs log trimming inside an infinite while loop. The loop uses sleep to wait until the next scheduled cleanup and is meant to exit when SIGINT or SIGTERM is received. However, when Bash is waiting for a foreground command to complete, it does not execute a trap until that command has finished. [0] That means that when SIGTERM arrives during the sleep, the signal handler is not called until sleep has finished.

When this script runs as a container in a Kubernetes Pod, this behavior can cause long delays when terminating the Pod. Kubernetes sends SIGTERM to all containers of a Pod when terminating it. If the container running the clean-airflow-logs.sh script is sleeping at that moment, it does not react to SIGTERM until the sleep ends, which may take a long time. As a result, the container exceeds the configured termination grace period and is forcibly killed by Kubernetes.
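One possible way to avoid the delay, shown here only as a minimal sketch and not as the actual script or a finished patch: run sleep in the background and block in the wait builtin instead. According to the same section of the Bash manual, wait returns immediately when a signal with a trap set is received, and the trap then runs right away. The names `trim_logs`, `on_term`, `interruptible_sleep`, and `run_cleanup_loop` are hypothetical placeholders, and the interval is an assumed parameter.

```bash
#!/usr/bin/env bash
# Sketch only: all function names here are hypothetical and do not come
# from the real clean-airflow-logs.sh.

sleep_pid=

# Placeholder for the actual log trimming logic.
trim_logs() { :; }

# Trap handler: stop the background sleep (if any) and exit cleanly.
on_term() {
  if [[ -n "$sleep_pid" ]]; then
    kill "$sleep_pid" 2>/dev/null
  fi
  exit 0
}

# Sleep in the background and block in `wait`. Unlike a foreground
# command, `wait` returns as soon as a trapped signal arrives.
interruptible_sleep() {
  sleep "$1" &
  sleep_pid=$!
  wait "$sleep_pid"
  sleep_pid=
}

# Trim logs, then wait for the given number of seconds, forever.
run_cleanup_loop() {
  local interval=$1
  trap on_term INT TERM
  while true; do
    trim_logs
    interruptible_sleep "$interval"
  done
}
```

Calling, for example, `run_cleanup_loop 900` at the end of the script would start the loop. With this structure the process should exit shortly after SIGTERM arrives, regardless of how much of the sleep interval remains. A signal that arrives while `trim_logs` is running is still handled only after `trim_logs` returns.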

I'll happily provide a patch to address the issue. Are you willing to accept a contribution for this?

[0] See the Bash Reference Manual, section 3.7.6 (Signals)

Hello! Is there any opinion/decision on the topic?