Naheel-Azawy/stpv

Long running previews are not aborted

Closed this issue · 8 comments

This piece of code kills a preview if it takes too long to run, right?

stpv/stpv

Lines 163 to 177 in 18ab12c

# Because this script is expected to run inside a file
# manager. Now imagine scrolling over lots of files
# quickly. This will create tons of processes that are
# not so light... Keep in mind `convert` runs a lot in
# here... So, this will save your life and keep your
# machine safe!
real_main "$@" &
real_pid=$!
count=0
while [ $count -lt 500 ] && kill -0 $real_pid 2>/dev/null; do
count=$((count + 1))
sleep .01
done
ps -o pid= --ppid $real_pid | xargs kill >/dev/null 2>/dev/null
wait $real_pid

Today I noticed that if I select a ~1MB JSON file, the previewer in lf just stucks at Loading... for 5-10 seconds (it runs jq to generate JSON with syntax highlighting). I tried to investigate why the code above doesn't work in that case but failed.

This piece of code kills a preview if it takes too long to run, right?

Yes.

Very good observation, thanks.
I guess it should be fixed by now. Please give it a try and let me know.

What was happening?

Assuming 92570 is the stpv process

> pstree -p 92570
stpv(92570)───stpv(92603)───jq(92615)

If we check jq's parent before stpv kills itself

> ps -oppid 92615
   PPID
  92603

Once stpv dies, we see that jq is still alive and has been orphaned

> ps -oppid 92615
   PPID
      1

Now the interesting part is that we have two stpv processes (92570 and 92603).
When I first wrote the self killing part this wasn't the case. After checking real_main and handle_now, I guess a subprocess is created here

echo "$handlers" | while read -r handler; do

After checking the way process are killed, the problem now is a bit more clear

ps -o pid= --ppid $real_pid | xargs kill >/dev/null 2>/dev/null

This kills only one level below the main stpv process. In our example, it will only kill 92603 and leave jq(92615) orphaned.

The solution is to kill all descendants of the main stpv process.

Thank you for such a detailed explanation!

I've tried the new version and it still does not work. It also keeps loading when I select another item (it was before but I forgot to mention it in the original message).

Ok, so I first tried with a ~2MB json, this took less than a second on my machine. I guess your file is stored on an HDD or a network storage?
So I tried with a ~240MB file. Now running jq alone takes long enough that I wasn't patient enough to not kill it.
But still, stpv seems to be functioning as expected on my machine.

stpv (master)> ls -lh deadbeef.json
-rw-r--r-- 1 naheel wheel 243M Apr 10 23:04 deadbeef.json
stpv (master)> time stpv deadbeef.json
Terminated

________________________________________________________
Executed in    5.61 secs      fish           external
   usr time  483.03 millis    0.00 micros  483.03 millis
   sys time  226.79 millis  390.00 micros  226.40 millis

I also tried it under the latest version of lf and it still seems to be working well.

The following loop is what's responsible of deciding how long the process should stay alive.
https://github.com/Naheel-Azawy/stpv/blob/master/stpv#L172
The combination of sleeping for 0.01 seconds and counting up to 500 means it will stay alive for about 5 seconds.
Calling kill and counting probably makes take slightly more time but it shouldn't be much (maybe that's different on your machine?).

Maybe try to reduce that 500 and see if lower values work better for you?
May also ask for your cpu info?

I'm sorry for a long repsonse.

I've tried changing 500 to a much smaller value (10) and it works! Surprisingly, I still got highlighted JSON. (I assume 10 "counts" was enough to get very first lines highlighted?)

Am I right that kill command here checks whether $real_pid is still running? Wouldn't it be better to check if [ -e "/proc/$real_pid" ]? It would avoid spawning kill every 10ms.

stpv/stpv

Line 172 in 7691385

while [ $count -lt 500 ] && kill -0 $real_pid 2>/dev/null; do

Hello again,

I apologize for my slow response too. I got busy and forgot about this.

Am I right that kill command here checks whether $real_pid is still running? Wouldn't it be better to check if [ -e "/proc/$real_pid" ]? It would avoid spawning kill every 10ms.

That's right. I think -e /proc should do well. I replaced it, thanks.

I've tried changing 500 to a much smaller value (10) and it works!

Ok, so I added the TIMEOUT option that is set to 5 seconds by default. For your case it should be 0.1

Surprisingly, I still got highlighted JSON. (I assume 10 "counts" was enough to get very first lines highlighted?)

Weird. As far as I understand, jq should parse the whole file before it prints it.

I created tests/test-self-killing.sh to test if this is working as expected or not. Feel free to mess around with it.
Here's a test run on my machine

stpv (master)> ./tests/test-self-killing.sh
>>> creating big json
-rw-r--r-- 1 naheel wheel 219M Apr 30 21:33 /tmp/testjson/fooo.json
>>> running jq alone
  {
    "aaaaaaaaaa": 99998
  },
  {
    "aaaaaaaaaa": 99999
  },
  {
    "aaaaaaaaaa": 100000
  }
]

real    0m17.717s
user    0m16.463s
sys     0m1.772s
>>> running stpv
Terminated

real    0m5.522s
user    0m0.429s
sys     0m0.192s
stpv (master)>
stpv (master)> echo 'TIMEOUT=0.1' >> ~/.config/stpv/config.sh
stpv (master)> ./tests/test-self-killing.sh --stpv-only
>>> creating big json
-rw-r--r-- 1 naheel wheel 219M Apr 30 21:34 /tmp/testjson/fooo.json
>>> running stpv
Terminated

real    0m0.152s
user    0m0.120s
sys     0m0.033s

I use dash as /bin/sh and it doesn't have time built-in command, which causes this error:

>>> creating big json
-rw-r--r-- 1 nikita nikita 219M May 14 16:45 /tmp/testjson/fooo.json
>>> running jq alone
./tests/test-self-killing.sh: 37: time: not found
>>> running stpv
./tests/test-self-killing.sh: 41: time: not found

I think the shebang for the test should be changed to /usr/bin/env bash to fix it.

I'm not a huge fan of forcing bashism. Maybe better fall to /usr/bin/time if available.
Otherwise, bash -c ./tests/test-self-killing.sh should be ok I guess.

I'm not a huge fan of bashisms either, just thought it would be an appropriate quick fix. Running with bash explicitly is a good solution, given that it's just a test.