Zombie processes remain when generating multiple plots despite including the --exit parameter.

Question

topherbuckley opened this issue 5 years ago · 5 comments

Issue:

Generating multiple plots by chaining bash commands with & (i.e. [plot 1 generator] & [plot 2 generator] & [plot 3 generator]) starts a separate process for each, so only the last plot can be exited with ctrl+c. Even if the data stream runs out or is turned off and each plot window is closed, these processes remain, and the plot windows reappear once the data stream starts again. The only way I've found to truly end these processes is to find each PID and kill it from the terminal.

Expected Behavior

Using the --stream and --exit parameters, I'd expect these background processes (the feedgnuplot, gnuplot, and window processes for each plot) to die.

Background

feedgnuplot v1.48
OS: Ubuntu 18.04.03
Window Manager: Regolith based on i3wm

A minimal example of the code I am using to generate the plots is as follows:

{
adb logcat -c
while true; do
        sleep 1;
        adb logcat -v raw $logcatTag *:S;
done
} | feedgnuplot --lines --stream $updateTime --exit &

{
adb logcat -c
while true; do
        sleep 1;
        adb logcat -v raw $logcatTag2 *:S;
done
} | feedgnuplot --lines --stream $updateTime --exit

All the bash variables are standard strings that I set through a CLI.

Is there any standard way of generating multiple plot windows without having to search for their PIDs and manually kill them every time I use feedgnuplot? Would you recommend some other approach? Is there an easy way to access the subplot functionality from gnuplot via a feedgnuplot feedthrough-type command? I saw a hanging open issue here. Is there any progress on that? You mentioned some things being possible, but not exactly what the issue creator had in mind. Is there any documentation or examples of doing what is currently possible?

Answer 1 · 2019-11-11T18:57:00.000Z

Zombie processes shouldn't be happening, and I'll try to reproduce in a bit.

In the meantime, some comments about representing multiple datasets.

feedgnuplot makes a fundamental assumption of one-window-per-feedgnuplot-process, and I'd rather keep that assumption.

Currently it also makes an assumption of one-plot-per-window. This can be loosened with the 'multiplot' feature of gnuplot, and that's what the linked issue is about. Doing this is possible, but would require a lot of typing, and realistically I'm not going to be working on this in the near future. If you want to add this feature, I'd definitely appreciate the help. The first step would be deciding on what the interface should be. How should the user communicate on the commandline that we want a multiplot, and which data should go where, which settings apply to which subplot, and such. Proposals welcome.

Finally, the way I would solve your problem is to have one window and one plot, but to render multiple datasets into that one plot. This is well-supported, and is the "normal" approach. You're plotting time-series, right? If so then all your datasets have a common x-axis (time). For your specific application, how many different datasets do you have? Do they all have the same units? If so, you can plot everything on the y axis. If you have two different ones, you can use the y-axis and the y2-axis. If you have more, you'll need to use either y or y2 for the rest, and maybe manually scale some things. If you attach a bit of your actual data, I can give you a usable feedgnuplot command.
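As a rough illustration of the one-window, two-axis idea, here is a minimal sketch. The `sample_rows` function is a hypothetical stand-in for a real data source, and the labels are made up; the options used (`--domain`, `--lines`, `--y2`, `--legend`, `--ylabel`, `--y2label`, `--xlabel`) are taken from the feedgnuplot manual, but check them against your installed version.

```shell
# Hypothetical stand-in for a real data source.
# Each row is "x small large": one x value and two series on very different scales.
sample_rows() {
    i=0
    while [ "$i" -lt 5 ]; do
        echo "$i $((i * 2)) $((i * 1000))"
        i=$((i + 1))
    done
}

# Reads rows on stdin and plots both series in one window.
# With --domain the first column is x; the remaining columns are curves 0 and 1.
# Curve 1 is moved to the y2 axis so the two scales do not squash each other.
plot_two_scales() {
    feedgnuplot --domain --lines \
        --y2 1 \
        --legend 0 'small-scale signal' \
        --legend 1 'large-scale signal' \
        --ylabel 'small units' --y2label 'large units' \
        --xlabel 'sample'
}
```

To try it, run `sample_rows | plot_two_scales`. For a live source, adding `--stream` and `--exit` to the feedgnuplot call should behave as in the question's own commands.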

Answer 2 · 2019-11-12T03:30:56.000Z

Alrighty. I can reproduce the zombie processes regardless of the input stream, so the steps to reproduce would be:

1. Turn two or more data streams on
2. [data stream1] | feedgnuplot --exit [other params|flags] & [data stream2] | feedgnuplot --exit [other params|flags]
3. Stop both data streams
4. Close both plot windows
5. Start both data streams

At this point you should see both previously closed plot windows reappear and plot as normal. (If used intentionally, this can actually be quite handy, but I don't think this is the intended behavior right?)

Alright, I may have some time a few weeks from now to look into multiplot functionality with this. I haven't even looked at your source code yet, so would you have any suggestions as to where to begin fiddling with this? Or some sort of conceptual overview of how to pass parameters or flags through to gnuplot directly?

Thanks for your suggestions, but that would get messy real quick. Currently I have three plot windows, each with three separate data sets (three lines per window), and the scaling is quite different in each. I'm able to do what I need at the moment, but I have to stop all my data streams and then run pkill gnu to clean up all the zombies.

Answer 3 · 2019-11-12T03:55:40.000Z

topherbuckley <notifications@github.com> writes:

> Alrighty. I can reproduce the Zombie processes regardless of what input stream, so a step by step for how to reproduce would be:
>
> 1. Turn two or more data streams on
> 2. [data stream1] | feedgnuplot --exit [other params|flags] & [data stream1] | feedgnuplot --exit [other params|flags]
> 3. Stop both data streams
> 4. Close both plot windows
> 5. Start both data streams
>
> At this point you should see both previously closed plot windows reappear and plot as normal. (If used intentionally, this can actually be quite handy, but I don't think this is the intended behavior right?)

OK, great. I'll take a look

> Alright, I may have some time a few weeks from now to look into multiplot functionality with this. I haven't even looked at your source code yet, so would you have any suggestions as to where to begin fiddling with this? Or some sort of conceptual overview of how to pass parameters or flags through to gnuplot directly?

Don't worry about the code just yet. I'd like to decide on the preferred interface before anybody does any coding. Let's pretend the tool does what you want already. What would you expect to type to tell this already-made future feedgnuplot what to do?

You're reading one stream of data on stdin, but generating multiple gnuplot plot() commands, one per subplot. Each subplot has its own axes, settings and so on. They're really independent in most ways. So you have to communicate to the tool which dataset (referred to as "curves" in the code) goes where. Maybe this would be doable similarly to how styles are currently done: "--subplot curveid subplotid". That's easy enough.

Can you go through the list of currently-supported feedgnuplot options, and see which ones become ambiguous with multiplots? Off the top of my head, stuff like --xlabel and --ylabel and --title and --set. Probably others. For each such ambiguous option we need to figure out how to set up the interface. It could get messy quickly.

I went through this process recently with the numpy-gnuplot interface I maintain:

https://github.com/dkogan/gnuplotlib

And I ended up doing a subparser-like thing, where you'd more or less pass a separate blob of options for each subplot. You use it like this:

https://github.com/dkogan/gnuplotlib/blob/master/demo.py#L317

I haven't written docs or made a release with this yet because I haven't tested it enough, and the person who requested this feature has disappeared. It seems to work, though. In feedgnuplot-land it'd look something like this maybe:

    feedgnuplot \
      --subplot curveid0,curveid1 '--xlabel x --ylabel y --with points' \
      --subplot curveid3 '--xlabel x --with lines --domain' \
      --stream

Currently everything is either a "plot option" or a "curve option". With multiplot support "plot options" will split into "subplot options" and "process options". Something like --title or --xlabel is a "subplot option" (applying to each subplot separately), but --stream would be a "process option" (applying to the whole thing). Is that a reasonable API? Too weird? Can you imagine how you would represent your use case with something like that?

Doing this would involve lots of typing, it would be intrusive, and it would be error-prone. So I'm both not excited about doing it myself, and also not excited about letting somebody else do it. If we agree on the API specifics, you should take a look at the code. Then if you feel comfortable, we can proceed. Thanks for stepping up!

> Thanks for your suggestions, but that would get messy real quick. Currently I have three plot windows, each with three separate data sets (three lines per window) and scaling is quite different in each. I'm able to do what I need at the moment but have to stop all my data streams, then running `pkill gnu` cleans up all the zombies.

Yeah. OK.

Answer 4 · 2019-11-14T08:18:47.000Z

Hi. I looked at your "zombie process" recipe, and if I'm understanding what you're doing correctly, then you aren't actually getting zombies here. What does "stop the data streams" mean? The normal end-of-data sequence is:

1. the process sending the data to feedgnuplot exits, which closes its end of the pipe
2. feedgnuplot detects that the sender is no longer there, and does whatever --exit tells it to do

If due to some bug everything looks shut down, but there's still an existing feedgnuplot or gnuplot process taking up memory, I'd call that a "zombie". In your case when you're "stopping the data stream" I think you're simply pausing sending data through the pipe. feedgnuplot then just sits there and keeps waiting for data. None of the --exit logic kicks in because the pipe is still open.
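The end-of-data sequence described above can be seen with a small stand-in pipeline. This is only a sketch: `produce` and `consume` are hypothetical names, and `consume` merely plays the role of feedgnuplot by reading until end-of-file.

```shell
# Producer: writes n lines and then returns. In a pipeline this ends the
# producing process, which closes the write end of the pipe.
produce() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

# Stand-in for feedgnuplot: reads until end-of-file, then reports and returns.
# read only fails (ending the loop) once every writer has closed the pipe.
consume() {
    count=0
    while read -r _line; do
        count=$((count + 1))
    done
    echo "EOF after $count lines"
}

produce 3 | consume   # prints: EOF after 3 lines
```

If the producer only paused (for example by sleeping forever after its last line) instead of exiting, `consume` would stay blocked in `read`, which matches the "just sits there and keeps waiting for data" behavior described above.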

At any point during the plot updates with --stream the plot window can be closed by the user (as you have seen), but the window will be re-created when the plot is refreshed (also, as you have seen). It would actually be a nice feature to detect the user closing the window, and then exiting the program. But that's actually pretty hard to do. With static plots (no --stream), I send "pause mouse close" to gnuplot. This has the effect of waiting until the user closes the window, and then exiting the program. The problem is that if we try to do that with --stream, gnuplot won't let me send it an update-plot command: it'll still be sitting there, waiting for the window to be closed. If you patch gnuplot to make some sort of interruptible "pause mouse close", then hooking into that from feedgnuplot would be pretty straightforward, and I'd be happy to do it.

Answer 5 · 2019-11-18T03:10:46.000Z

> What does "stop the data streams" mean? The normal end-of-data sequence is:
>
> 1. the process sending the data to feedgnuplot exits, which closes its end of the pipe
> 2. feedgnuplot detects that the sender is no longer there, and does whatever --exit tells it to do

I think you may have isolated the issue with this statement. In my use case, I was stopping the Android application that was feeding the adb logcat stream (i.e. just not adding new data rather than actually closing the pipe). I tried killing the adb process while the feedgnuplot process was running to see if this would properly close the pipe, but the adb process keeps restarting for some reason. I can properly kill the adb process only after I kill the feedgnuplot process. So it appears that both my data stream (adb logcat) and feedgnuplot are waiting for each other to end before properly exiting, and this creates the "zombie"-ish situation.

You may not be familiar with adb, and I am surely showing my bash ignorance here (excuse me for that), but how would you properly close the pipe in the stream example you give:

$ while true; do sleep 1; cat /proc/net/dev; done |
gawk '/wlan0/ {if(b) {print $2-b; fflush()} b=$2}' |
feedgnuplot --lines --stream --xlen 10 --ylabel 'Bytes/sec' --xlabel seconds

Of course, I'd expect the pipe to properly close if the loop had some limit or condition in which it should no longer loop, but is there any way to properly close the pipe in an infinite loop like this?
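One possible approach, offered as an assumption rather than something confirmed in this thread, is to replace `while true` with a loop that checks for a sentinel file. Creating the file tells the producer to return, which closes its end of the pipe. The function name `stream_until_stopped` and the stop-file path are hypothetical, and the counter stands in for real data.

```shell
# Emit one sample per interval until the stop file appears, then return.
# Returning ends the producer side of the pipeline, so the consumer
# (for example feedgnuplot with --exit) sees end-of-file.
stream_until_stopped() {
    stop_file=$1   # hypothetical sentinel path; create it to stop the stream
    interval=$2    # seconds to sleep between samples
    i=0
    while [ ! -e "$stop_file" ]; do
        echo "$i"  # placeholder sample; a real loop would print actual data
        i=$((i + 1))
        sleep "$interval"
    done
}
```

A possible use would be `stream_until_stopped /tmp/stop_plot 1 | feedgnuplot --lines --stream --exit`, followed by `touch /tmp/stop_plot` from another terminal once the plot is no longer needed. The loop notices the file at its next check, so shutdown can lag by up to one interval.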

> At any point during the plot updates with --stream the plot window can be closed by the user (as you have seen), but the window will be re-created when the plot is refreshed (also, as you have seen). It would actually be a nice feature to detect the user closing the window, and then exiting the program. But that's actually pretty hard to do. With static plots (no --stream), I send "pause mouse close" to gnuplot. This has the effect of waiting until the user closes the window, and then exiting the program. The problem is that if we try to do that with --stream, gnuplot won't let me send it an update-plot command: it'll still be sitting there, waiting for the window to be closed. If you patch gnuplot to make some sort of interruptible "pause mouse close", then hooking into that from feedgnuplot would be pretty straightforward, and I'd be happy to do it.

I'm not familiar enough with gnuplot to know the meaning/workings of sending "pause mouse close", but this may be of interest. The initial question is regarding how to detect a Firefox window closing, but the answers are pretty generalized.

Regarding the multiplot functionality:

> Currently everything is either a "plot option" or a "curve option". With
> multiplot support "plot options" will split into "subplot options" and
> "process options". Something like --title or --xlabel is a "subplot
> option" (applying to each subplot separately), but --stream would be a
> "process option" (applying to the whole thing). Is that a reasonable
> API? Too weird? Can you imagine how you would represent your use case
> with something like that?

I think this deserves a whole separate issue thread, but in short, the API you describe sounds like what I would expect, though I do see your point about things getting messy quickly. Maybe we are asking too much of a one-liner, or even of a non-OOP language like bash. Setting things up with an OOP approach as in your gnuplotlib, where each subplot is a separate object with attributes like xlim, etc., makes a lot more sense to me. You could try to force an OOP approach like this, but maybe it's best to leave this to other tools?