NOHUP External Script Hangs in Background after NETCAT Broken Pipe

Question

DimplyKhan13 opened this issue 5 months ago · 2 comments

SUMMARY

A bash script behaves differently when Zabbix calls it as an external script than when a user runs it in a terminal.
My script uses nohup to start a second script, which opens a netcat client connection and, once the server closes the connection, uses zabbix_sender to push the result to Zabbix.

When the script is run manually from a terminal inside the container, with the same parameters Zabbix passes to the external script, the result is pushed to Zabbix correctly. When Zabbix calls it, however, it hangs indefinitely.

I tracked the problem down to the way I keep the connection open: sending "#" periodically. The idea is that when the server closes the connection, the next write fails with a broken pipe and the script continues as intended.
This does not work when Zabbix calls the script.
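
One possible explanation, which nothing in this thread confirms: a process started by a daemon can inherit SIGPIPE in the "ignored" state, and a non-interactive shell cannot reset a signal that was ignored on entry. In that case the writer is not killed when nc exits; each echo simply fails with a broken pipe error and "while true" keeps looping, which would match the repeated errors described below. The sketch below makes the loop condition the write itself, so it stops on the first failed write whether or not SIGPIPE is delivered. keep_alive is a hypothetical helper name, and "head -c 3" only stands in for nc so the example runs without a network.

```shell
#!/bin/sh
# keep_alive (hypothetical helper): emit "#" until a write fails.
# The loop ends on the first failed printf, so it does not rely on
# SIGPIPE killing the writer.
keep_alive() {
    while printf '#'; do
        sleep 0.1
    done 2>/dev/null   # hide the "broken pipe" message of the last write
}

# Demonstration without a network:
# - trap '' PIPE simulates a parent that left SIGPIPE ignored (assumption)
# - head -c 3 stands in for nc and exits after reading three bytes
out=$( (trap '' PIPE; keep_alive) | head -c 3 )
echo "$out"
```

In get.sh the same idea would read (echo "$@"; keep_alive) | nc 10.0.0.1 1234, leaving the rest of the script unchanged.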

My workaround, for now, is to install a newer netcat with "apk add netcat-openbsd", because that version supports the -q parameter, which gives the same result without the keep-alive loop. Could this be the default? That is, since netcat is already installed in the container, could the image ship the updated version instead?
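
A minimal sketch of what the -q variant of get.sh could look like, assuming nc comes from the netcat-openbsd package, where -q N keeps nc running for up to N seconds after end of input on stdin. fetch_result and WAIT_SECONDS are hypothetical names, and the 120 second limit is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Assumption: nc is the netcat-openbsd build, which accepts -q.
# WAIT_SECONDS (hypothetical) bounds how long nc waits for the server's
# answer after the request has been sent and stdin has reached EOF.
WAIT_SECONDS=120

# fetch_result (hypothetical helper): send the parameters as one line and
# print whatever the server returns before it closes the connection.
fetch_result() {
    printf '%s\n' "$*" | nc -q "$WAIT_SECONDS" 10.0.0.1 1234
}
```

get.sh would then call result=$(fetch_result "$@") and pass "${result}" to zabbix_sender as before, with no "#" characters mixed into the request.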

OS / ENVIRONMENT / Used docker-compose files

To reproduce the problem I used the default "zabbix/zabbix-proxy-mysql:alpine-6.0-latest" image. Only environment-specific values were changed, such as passwords, hostnames, IPs, and memory and CPU limits. A folder was also mapped for the scripts.

CONFIGURATION

Scripts:

start.sh -

/usr/bin/nohup /bin/sh /usr/lib/zabbix/externalscripts/get.sh $1 $2 >> /usr/lib/zabbix/externalscripts/output.txt 2>&1 &
echo "Started"

get.sh -

hostname=$1
shift

result=$( (echo "$@"; while true; do sleep 0.1; echo -n "#"; done) | nc 10.0.0.1 1234)

zabbix_sender -z localhost -s $hostname -k script.result -o "${result}"

STEPS TO REPRODUCE

Create a default Zabbix container, mapping a folder to /usr/lib/zabbix/externalscripts/ that contains two scripts, start.sh and get.sh, and a text file, output.txt.
Point netcat at a server that takes 30 seconds to a minute to respond. Something like:

socat tcp-listen:1234,reuseaddr,fork \
    system:'sleep 30; echo Done'

** The real response is produced by a more complex system and can take milliseconds or minutes, depending on the parameters. **

Create an external check item to run the script and a trapper item to receive the response from zabbix_sender.
Run the script from inside the container, and then from an item in Zabbix.

EXPECTED RESULTS

It is expected that both ways to start the script result in the trapper receiving the "Done" string.

ACTUAL RESULTS

Running the script from the container worked as expected. However, when Zabbix called the external script, no result was sent and the script kept running in the background. The output file was updated every second with a "brokenpipeerror: [errno 32] broken pipe" error until the script was stopped manually.

Answer 1 · 2024-05-05T09:21:58.000Z

It is better to use a cron job for such tasks, not a Zabbix item.

Answer 2 · 2024-05-06T13:33:32.000Z

Since your script can take so long to finish (from milliseconds to minutes), I'd say using an item is a bad idea because it "locks" a proxy poller. If those items pile up, your proxy will be "stuck", not gathering any data and just waiting for responses. Also, consider that the proxy Timeout setting maxes out at 30 seconds:

Timeout (mandatory: no, range: 1-30, default: 3): Specifies how long we wait for agent, SNMP device or external check (in seconds).

So perhaps you are better off using some other utility to gather that information and send it to Zabbix. I can think of a few ideas:

Use timeout in the current script with a limit lower than the proxy timeout (something like timeout 29 netcat ...).
Use crontab or a scheduler in another container to run your tasks, with Zabbix receiving the result through a trapper item.
Use crontab or a scheduler in another container to run your tasks, with Zabbix just collecting the result from there.
Create a new Docker image based on the Zabbix one and make cron work there, so you have an all-in-one container.
Create a new Docker image based on the Zabbix one with a new entrypoint that runs what you need in a background loop and then calls the default entrypoint to start the proxy. This way you have your own "cron-ish" loop in the background calling your scripts, and Zabbix later polls that data in.

In every scenario, be careful about stale data, error handling, etc.
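
As a rough illustration of the first idea (a timeout lower than the proxy limit) combined with the note on error handling, the sketch below wraps an arbitrary command in timeout and prints an explicit marker when the command fails or runs out of time, so the trapper does not silently keep an old value. fetch_with_deadline and the TIMEOUT_OR_ERROR marker are hypothetical names; "sleep 5" and "echo Done" only stand in for the real netcat call.

```shell
#!/bin/sh
# fetch_with_deadline (hypothetical helper):
#   $1   = time limit in seconds
#   rest = command to run
# Prints the command's output, or a marker value if the command failed or
# was killed by timeout, and returns non-zero in that case.
fetch_with_deadline() {
    limit=$1
    shift
    if out=$(timeout "$limit" "$@"); then
        printf '%s\n' "$out"
    else
        printf 'TIMEOUT_OR_ERROR\n'
        return 1
    fi
}

# Stand-ins for the netcat call: one too slow for its limit, one fast.
slow=$(fetch_with_deadline 1 sleep 5) || true
fast=$(fetch_with_deadline 5 echo Done)
echo "slow=$slow fast=$fast"
```

In get.sh the call might look like result=$(fetch_with_deadline 29 nc 10.0.0.1 1234), with the marker sent through zabbix_sender like any other value so a trigger can react to it.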