garybernhardt/selecta

Error when exiting with ^C under some shells

Opened this issue · 24 comments

If I run bash -c 'cat $(ls | selecta)' and ^C out of the selecta prompt, then I get the following error:

/home/michaelpj/bin/selecta:782:in `block in command': Command failed: "stty 4500:5:bf:8a3b:3:1c:7f:15:4:0:1:ff:11:13:1a:ff:12:f:17:16:ff:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0\n" (RuntimeError)
        from /home/michaelpj/bin/selecta:779:in `pipe'
        from /home/michaelpj/bin/selecta:779:in `command'
        from /home/michaelpj/bin/selecta:772:in `stty'
        from /home/michaelpj/bin/selecta:565:in `restore_tty'
        from /home/michaelpj/bin/selecta:543:in `ensure in block in with_screen'
        from /home/michaelpj/bin/selecta:544:in `block in with_screen'
        from /home/michaelpj/bin/selecta:752:in `call'
        from /home/michaelpj/bin/selecta:752:in `with_tty'
        from /home/michaelpj/bin/selecta:536:in `with_screen'
        from /home/michaelpj/bin/selecta:44:in `main'
        from /home/michaelpj/bin/selecta:791:in `<main>'

If I try to run this in a bash shell directly, then I don't see the error, but my terminal becomes messed up: input stops appearing, but I can still enter commands.

I've tried with --norc --noprofile --noediting --posix, and that doesn't help.

I also see the error when I run zsh -c 'cat $(ls | selecta)', but have no problems running the command directly at a zsh prompt.

I don't see this using sh on my system (which is dash).

I'm sure this is some weird terminal interaction, but I'm afraid I don't have any more leads than that! In general, it seems to only happen when selecta is running inside a command substitution.

Are you on Linux? I noticed an issue with stty on Linux that doesn't occur on OS X:

rschmitt/heatseeker@6b2d27e#diff-a6432244543ebcd356024444b2fe297aR148

I'm not sure why you would only see this on ^C, however.

It doesn't seem to be related to the newline from stty. On OS X Yosemite, I can reproduce the error when running via bash -c, but can't reproduce the terminal breakage when running directly.

I just checked and this happens throughout Selecta's entire history. I suspect that it's something like: the shell is seeing the SIGINT first, starts cleaning up, and by the time Selecta sees it the TTY is already gone. But there must be some way to get around that, because fzf (which uses curses) doesn't have this problem.

@michaelpj, can you try it on the raw_input branch? I changed the stty command to put the terminal in raw mode, which solves the problem for me, and hopefully for you. If that seems to work, I'll merge it to master.

(Note that with the version of the command using bash, hitting ^C only kills Selecta, not the whole pipeline. I don't think there's any good way to make ^C send SIGINT to the entire process group, as it normally would, but also fix this bug.)

Yep, that fixes it for me. Things now work as I expect when running bash -c. Running it in an actual bash terminal still has slightly odd behaviour - I have to ^C twice to get back to my prompt. I don't know whether that's due to what you mentioned about the SIGINT only killing selecta.

However, running my pipeline via -c or in a script doesn't seem to have this problem - everything finishes up nicely. So this is good as far as I'm concerned!

This changes behaviour in some of my scripts; for example, this one for opening a file in vim.

Previously if I ^Ced within selecta it would drop back to the command prompt, now it proceeds to open vim. I'm guessing that's a side effect of it only killing selecta?

I can alter my scripts to account for this easily enough, just thought I'd raise the change in behaviour that some people might be relying on at the moment.

Yes - on closer inspection that's also what I'm seeing, it's just that in this case I don't actually mind! It looks like the command substitution evaluates to an empty string, rather than killing the caller.

Yes, the need for two ^Cs is the result of ^C only killing Selecta. Normally, ^C sends SIGINT to the entire foreground process group, which would contain the whole pipeline. With the terminal in raw mode, the ^C just shows up on the TTY and is read by Selecta.
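To illustrate the raw-mode behaviour described above, here is a minimal Ruby sketch. It is not Selecta's actual code: it assumes a pseudo-terminal from the standard library's pty and io/console can stand in for a real terminal, and the method name read_ctrl_c_in_raw_mode is hypothetical.

```ruby
require "pty"
require "io/console"

# Open a pseudo-terminal, put its terminal side into raw mode, and
# "type" ^C (byte 0x03) on the other side. In raw mode the terminal
# driver does not turn ^C into SIGINT; the byte is delivered as
# ordinary input for the reading program to interpret.
def read_ctrl_c_in_raw_mode
  PTY.open do |master, slave|
    slave.raw!             # disables signal generation, line buffering, echo
    master.write("\x03")   # simulate the user pressing ^C
    slave.readpartial(1)   # the reader sees the raw byte
  end
end
```

A program reading input this way has to recognise the 0x03 byte itself and decide what to do with it, which is why ^C no longer reaches the rest of the pipeline automatically.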

Selecta could send SIGINT to its own process group in response to ^C. It's not even hard: Process.kill("INT", -Process.getpgrp()). However, this is scary.

Selecta does a lot of TTY stuff now, and I feel like I understand what it's doing. But signals and process hierarchies aren't my strong point. I'm not sure that an arbitrary process intercepting a ^C and then SIGINTing its own process group is right. It seems like it should be right, but this stuff is arcane and who knows? (Seriously, if someone knows, that would be great. ;)
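For anyone who wants to experiment with the process-group idea safely, here is a hedged sketch. It is an assumption-laden demo rather than what Selecta does: it forks a child that moves into its own process group first, so the signal cannot reach the shell or anything else. The method name sigint_own_group_demo is hypothetical.

```ruby
# Fork a child that creates its own process group, installs a SIGINT
# handler, and then signals that whole group (a negative pid targets a
# process group). The child reports back over a pipe whether its
# handler ran.
def sigint_own_group_demo
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    Process.setpgrp                        # isolate: new group containing only the child
    received = false
    trap("INT") { received = true }        # keep the handler trivial
    Process.kill("INT", -Process.getpgrp)  # same call as quoted above
    40.times do                            # wait up to ~2s for delivery
      break if received
      sleep 0.05
    end
    writer.write(received ? "INT received" : "no signal")
    writer.close
    exit!(0)
  end
  writer.close
  result = reader.read
  reader.close
  Process.wait(pid)
  result
end
```

In a real pipeline the group would contain every process in the pipeline, so each of them would receive the SIGINT, much as they would if the terminal driver had generated it.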

I created a gist showing a minimal, isolated version of the original problem. I'm still stumped, though. https://gist.github.com/garybernhardt/d496c69bf9e7cc03ddbd

Well I now know more about subshells than I did at the start of the day. From what I can tell this is a feature of grouping commands in bash so scripts like mine linked above are working as expected.

What I should be doing if I want to use a subshell is generate the error myself. However, I've learnt about this all in the last half an hour so there could be nuances I don't know about.

I just pushed another branch, raw_input_2, that seems to fix this. It does the send-SIGINT-to-process-group trick that I mentioned above. I was skeptical of that technique, but Glyph says that it will probably work (https://twitter.com/glyph/status/573617810913431552), and he knows a lot more about this stuff than me.

@gshutler and @michaelpj , can you try the raw_input_2 branch and see whether it acts as you'd expect? Selecta's behavior there should be exactly as it's always been, except without the initial issue reported by @michaelpj in this thread.

👍

👍

Merged into master in abee7d3. Thanks for the excellent bug report and help figuring it out.

I'm able to reproduce this error by running find * -type f | selecta twice in quick succession. I'm on a Mac running the current HEAD of selecta using zsh 5.2 and ruby 2.3.0.

Using the vim integration as documented I can reproduce this error ~75% of the time.

I'm failing to reproduce it with zsh 5.2 and Ruby 2.3.0, so it must be caused by something sneakier. Can you try the steps below? Note that some of them involve clearing all environment variables, so you'll need to reactivate Ruby 2.3.0 via your ruby version manager unless it's the globally installed version.

  • Run zsh as env -i zsh -f -d and try to reproduce. This will clear all environment variables and won't load any zsh configs, including machine-wide ones.
  • Run zsh as env -i zsh, which will clear environment variables but still load your zsh configs.
  • Run zsh as zsh -f -d, which will leave your environment variables alone but not load your zsh configs.

Interesting, here's what I'm finding:

  • given env -i zsh -f -d I cannot reproduce the error
  • given env -i zsh I can reproduce the error
  • given zsh -f -d I can reproduce the error

I'm reporting this back in order to close the loop, but I'll obviously need to investigate further. If you have any ideas, let me know.

Awesome, that at least gives us a reference point. I think a good next step would be to compare the results of export when run in env -i zsh -f -d vs. zsh -f -d. In the former case, the only variable shown should be PATH. In the latter case, your zsh configs aren't being loaded, so one of those environment variables should be the culprit. Here's an idea for narrowing it down:

  1. Run zsh -f -d, then within that run export > foo.sh.
  2. Open a new terminal, run source ./foo.sh, and try to break selecta.
  3. Comment out a line in foo.sh.
  4. Repeat steps 2-3 until Selecta doesn't break. When it works for the first time, you know that the last line you commented out was the culprit. When you find that line, I'd recommend doing extra careful repro attempts with and without that line, just to be sure that it's not a false positive.
  5. If it still breaks when foo.sh is zero lines long, your computer is haunted.
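As a possible shortcut before commenting lines out one by one, the two export dumps can be compared to list only the variables that differ. This is a rough sketch that assumes each line of the dump looks like NAME=value (quoting is not interpreted); the helper names parse_env_dump and env_candidates are hypothetical.

```ruby
# Parse lines of the form NAME=value into a hash; anything else is skipped.
def parse_env_dump(text)
  text.each_line.each_with_object({}) do |line, vars|
    name, value = line.chomp.split("=", 2)
    vars[name] = value if value && name.match?(/\A[A-Za-z_][A-Za-z0-9_]*\z/)
  end
end

# Names that are set in the suspect dump but missing from, or different
# in, the clean dump. These are the variables worth bisecting.
def env_candidates(clean_dump, suspect_dump)
  clean = parse_env_dump(clean_dump)
  suspect = parse_env_dump(suspect_dump)
  suspect.reject { |name, value| clean[name] == value }.keys.sort
end
```

Usage would be something like env_candidates(File.read("clean.sh"), File.read("foo.sh")), where clean.sh is an assumed file name for the dump taken under env -i zsh -f -d.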

bedge commented

@michaelavila, any update on this?
Seeing the same thing with same zsh/ruby versions.
Wanted to see if you had any results with the above test before replicating the effort.

This is only happening (for me) with stty from GNU, the BSD stty is working fine.

If you have coreutils installed through homebrew, then you're going to run into this eventually. Using the stty in /bin works fine.

Additionally, this doesn't just happen for me on ^C, selecting something will trigger the problem as well.

bedge commented

@michaelavila thanks for the stty tip, changing to /bin/stty in selecta fixed this for me.
I have too much other stuff that needs coreutils in the PATH to go back to the outdated OS X BSD versions of everything else.
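The workaround above amounts to preferring the system stty over whatever comes first in PATH. A minimal sketch of that choice follows. This is not Selecta's code, and the helper name stty_path and its parameters are hypothetical.

```ruby
# Return the first executable path from the preferred candidates, or
# fall back to a bare command name that will be resolved through PATH.
def stty_path(candidates = ["/bin/stty"], fallback = "stty")
  candidates.find { |path| File.executable?(path) } || fallback
end
```

A caller would then build its command from stty_path instead of a bare stty, so a Homebrew coreutils stty earlier in PATH is not picked up when /bin/stty exists.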

Glad I could help, though I'm curious what the real solution is. I rely on coreutils as well.