Bill-Gray/PDCursesMod

wingui: Multi-threaded code responds very slowly

slipher opened this issue · 8 comments

I'm trying to update the pdcursesmod used with Daemon Engine. With the latest version of the library, I found that in some configurations curses takes several seconds to responds to a keystroke. The game engine runs dozens of frames (querying getch() once per frame) in between when the key is actually pressed, and when getch reports the key press. After bisecting, I found that the problems started with commit dd59e4c. It seems that the multi-threaded event code introduced there does not always report keystrokes in a timely manner.

If you ask me, it would be better to remove the input thread. Having a second thread seems like an inordinately expensive/power-hungry/etc. solution for a library focused on doing micro-optimizations to repaint as little of the screen as possible.

A bit more information: I'm testing on Windows 10. Whether the bug appears seems to be related to the CPU usage patterns of the main thread. If I set the FPS very high (so that the main thread does not sleep or at least not much), the bug does not occur, whereas if the main loop is sleeping 60 ms each frame, then it does.

Well, that's a puzzler. Complicating matters is that we reverted commit dd59e4c two years later, in commit 25ead98. (*) However, as the latter cites, "there were conflicts that had to be resolved, so this wasn't a pure revert." There were plenty of changes in the intervening two years.

Not sure where to suggest looking to debug this. I'd probably try putting in some code to emit a time stamp to a file when a WM_KEYUP event is received in WndProc() in wingui/pdcscrn.c, and then emit a time stamp in a different file when your program actually gets the key via getch(). If those time stamps resemble one another closely, then the problem is that (somehow or other) the key is getting lost in the bowels of Windows for a while. If not, you could add some more time-stamping to see where the hold-up is.

Are mouse events similarly delayed? I'd assume so, but... that may also give us a clue as to where to look.

(*) <rant ON> Agreed on dual-threadedness not being, in general, a Good Idea. For me, the issue hasn't been expense. Roughly the same number of CPU cycles were expended either way; they just got split up among two threads. No, the problem (for me) was that because the threads had to communicate a lot, there was plenty of opportunity for weird, irreproducible bugs. It made me think of this (heavily modified) quote:

You have a problem. You decide to fix it by adding another thread. Now you have five problems.

The X11 port used to be dual-threaded; William McBrine figured out a way to use a single thread in PDCurses, and I eagerly took that for PDCursesMod. WinGUI started out single-threaded, and then I (thought I) had to add a second thread. Somebody figured out that I was wrong and we joyously returned to a single thread (this was all before PDCursesMod was even on GitHub.) And then, as you saw, it returned to dual-threadedness for two years.

I've used multi-threading to say, "Here are 100 problems to be solved that are independent of one another; split them up across however many cores the machine has and report back when you're done." That works a treat. But I hope never to see multi-threadedness in PDCursesMod again. </rant>

Oh, I should have thought of this while writing the above reply : if you use commit 25ead98, right after the reversion to single-threadedness, do things work? Such is devoutly to be hoped; your bisecting can then be limited to commits since 2022 July 24.

I think I may have it. At least, I have a contributing factor.

If you look at PDC_check_key() in wingui/pdckbd.c, you'll see that we don't actually check for new key hits every time. Doing so (in the manner we were doing it at the time) limited us to 1000 key checks per second. If you had something like

while( !kbhit())`
{
    /* do something */
}

you could only loop 1000 times a second, maximum, even if "do something" was nearly instantaneous (see issue #197 and the comments in the tests/keytest.c program).

The way I implemented it, though, we're relying on WinProc() getting called regularly. If it's not (perhaps because other processes are crowding it out), key hits won't be detected. So we basically need to check Windows messages every time PDC_check_key() is called.

Try this version of wingui/pdckbd.c in place of the currently posted one. In it, PDC_check_key() basically says : if there isn't a key already in the queue, check Windows messages, until either we find a key or run out of messages.

Seems to work fine here after non-exhaustive testing (under Wine; not at my Win10 machine at present). The logic seems much more logical to me. The tests/keytest.c program even runs slightly faster with it. And I think it'll fix this issue. What's not to like? (Please let me know if it does or doesn't fix your actual problem, though.)

Try this version of wingui/pdckbd.c in place of the currently posted one. In it, PDC_check_key() basically says : if there isn't a key already in the queue, check Windows messages, until either we find a key or run out of messages.

Seems to work fine here after non-exhaustive testing (under Wine; not at my Win10 machine at present). The logic seems much more logical to me. The tests/keytest.c program even runs slightly faster with it. And I think it'll fix this issue. What's not to like? (Please let me know if it does or doesn't fix your actual problem, though.)

Yep, thanks. With this file applied on the PDCursesMod master branch, the Daemon engine responds smoothly to inputs.

Just spent a good bit of time looking at other WinGUI keyboard, mouse, and resizing issues that have been raised around this code (there have been several) and have convinced myself that we won't cause past issues to recrudesce. Between that, testing on my end, and your "works now on your machine", I've applied the fix.

I'm confused - did we have a 4.4 release? There is no tag for it yet (if just missing, then please create/push).
If there wasn't a 4.4 release this is one of the additional bugfixes - please consider to do one.

I'm confused - did we have a 4.4 release?

We did not. I do releases on request. Usually, all that's involved is checking version constants and updating HISTORY.md and USERS.md, then going through GitHub's reasonably straightforward release procedure.

Since you've requested, I've released 4.4 (and it does include the above bug fix.)