Keyboard events lock up in release build
tcbennun opened this issue · 12 comments
Just tested out 0.5.0...
On my Linux system, Rust stable, with --release
build, the keyboard events lock up for seconds at a time. It's not precisely repeatable. Sometimes, the event will go through maybe half a second after the keypress, then sometimes, it'll take several seconds.
On occasion, I'll hit CTRL+R five times in quick succession, then a few seconds later, they'll all be detected at once.
I haven't been able to reproduce this in debug build.
Okay, so it's definitely a problem with the serial port (as expected): if I modify stdin_thread_fn
to echo my input, I can see that it's only CTRL+R that causes the thread to lock up temporarily - so it's blocking when attempting to acquire the mutex.
The mutex should only be locked by the main thread for 200 ms at a time, since that's the serial port's configured timeout. I can only assume that serial::unix::TTYPort
isn't respecting the timeout.
I thought perhaps having a large, 1024-byte input buffer was causing the problem, since serial-unix
seems to try to grab the whole buffer with a single read
syscall... tty.rs:132
... but reducing our buffer to 16 bytes didn't fix the bug.
(Also, the library does in fact set O_NONBLOCK
anyway, so we should get WouldBlock
instead of that syscall blocking.)
It's not TTYPort
's fault... it's not blocking at all. Unclear why the stdin thread sometimes fails to lock the mutex despite many cycles of locks-unlocks in the main thread.
... or is TTYPort
blocking after all? if I add a rprintln
at lib.rs:161, I never see it (whether in debug or release!)
Ugh. It might also have something to do with the sleep at https://github.com/esp-rs/espmonitor/blob/master/espmonitor/src/lib.rs#L160-L161 -- before I added that, the keyboard commands never worked at all, because for some reason the input thread never got a chance to lock the mutex.
But it seems like you're never getting there, which is odd. You'd think that serial output would stop at least long enough for there to be zero-byte reads or timeouts.
It's also weird that I don't see this problem (on Linux here too).
Re: clearing O_NONBLOCK
, it looks like they do a poll()
with a timeout before reading, and then only read if there's data ready. The problem with that, IIRC, is that just because poll()
claims there is data ready, it doesn't definitively mean that the read won't block. So I wonder if that's what's happening in your case. Actually, I'm not 100% sure about that; I can only find mention of that behavior in the man page for select()
, and the example case is if it's a TCP socket and the data later gets discarded because a checksum is bad or something. Not sure this applies to TTYs and/or poll()
.
Maybe the next thing to try is to time the .read()
calls to see if reads sometimes block forever, or even just block longer than the 200ms timeout?
I was "hoping" that this was just an annoying threading issue, in which case I was thinking to just forget about the threads and do the serial and keyboard reads one after another in the main thread (200ms timeout on the serial read, nonblocking poll check via crossterm), but if the serial library's read timeout isn't working, I don't think that will work either.
I just removed the threading as a test. It seems to work fine for me, both when there's actively new output ready on the serial line, and also after output stops. Can you give it a try and see what happens for you? Should be able to install with:
cargo install --git https://github.com/esp-rs/espmonitor --branch fix-input-hang cargo-espmonitor
Nice one @kelnos, that works much better in both debug and release. Tested both when the device is producing constant output, and when producing no/infrequent output. Tested on Linux and Windows. edit: only tested rust stable
P.S. Off-topic, but out of interest, why fork at startup for unix? I tested with the forking removed and it seems to work fine?
(I'm still perplexed as to how removing the threading has even worked. I wonder whether it's actually a bug with Rust's atomics? Unlikely, I know.)
P.S. Off-topic, but out of interest, why fork at startup for unix?
To ensure that even if the monitor exits/crashes unexpectedly, we'll still put the terminal back into "cooked" mode. I'm not sure if that's necessary on Windows, and Windows doesn't have fork()
at any rate, and I'm not sure how to achieve the same thing there.
(I'm still perplexed as to how removing the threading has even worked. I wonder whether it's actually a bug with Rust's atomics? Unlikely, I know.)
When I first wrote the threaded code, the thread::sleep()
wasn't there, and the input thread was never able to lock the mutex around the serial port. That seemed very strange to me, as I would expect the kernel to give the input thread at least a little bit of time between iterations of the serial reader's loop
. It's possible that the sleep()
I added just happened to be enough on my system but not enough on yours. (A fine example why "fixing" threading problems with sleep()
is rarely the right answer; shame on me.)
Overall I'm just really disappointed that I can't seem to find a combination of libraries that satisfy:
- Supports serial and raw keyboard input on both unix and windows.
- Both serial and keyboard input can be watched in a non-blocking manner in a single thread using
select
/poll
/etc. - Alternatively, support for
async
/await
on both the serial and input devices so I can kick off read/poll operations on both and thenjoin!()
them in a loop on the main thread.
I suppose if I spent some time I could get the third thing to work, but I already did spend several hours on it and it was a bit of a mess. This could be due to my relative inexperience with async
/await
in Rust, though.
Anyhow, I'll merge that branch and cut a new release, since it seems to be at least better then 0.5.0, even if I'm not happy with how the input loop works.
To ensure that even if the monitor exits/crashes unexpectedly, we'll still put the terminal back into "cooked" mode. I'm not sure if that's necessary on Windows, and Windows doesn't have
fork()
at any rate, and I'm not sure how to achieve the same thing there.
Makes sense, thanks.
Overall I'm just really disappointed that I can't seem to find a combination of libraries that satisfy:
- Supports serial and raw keyboard input on both unix and windows.
- Both serial and keyboard input can be watched in a non-blocking manner in a single thread using
select
/poll
/etc.- Alternatively, support for
async
/await
on both the serial and input devices so I can kick off read/poll operations on both and thenjoin!()
them in a loop on the main thread.
I'd be okay with serial
if it was maintained, because it's nearly there, really! There's a real problem with very promising crates becoming abandoned for years.