fuzzball-muck/fuzzball

Slowness when pasting commands with lots of MUF instructions

  1. Create a testprog command with this MUF:
@program testprog.muf
1 99999 d
i
$PRAGMA comment_recurse

: main ( s --  )
  "Start." .tell 0 begin ++ dup 100000 > until "Done." .tell
;
.
c
q
@set testprog.muf=M3
  2. Run 'testprog' repeatedly by hand, very fast (for example, by pressing up to recall the previous command). Notice that there's no problem.
  3. Now paste in a big block like this and watch it grind very slowly:
testprog
testprog
testprog
testprog
testprog
  4. Change it to do 100 iterations instead of 100000 and the problem goes away.

Other users aren't disrupted while this is happening. @q responds sluggishly as well.

I don't know why there's a difference between when I paste the commands and when I type them, even if I enter them very fast. Maybe all the commands in the paste are going out in the same TCP packet and getting processed all at once by the server? Maybe that's the difference?

Hopefully you're able to replicate this with your clients. I've tried two clients on my Linux console: tinyfugue 5 and the raw telnet command, in both GNOME Terminal and Terminator.

My fault for writing monstrous unwieldy MUF programs maybe. 😂

I don't know if this is the same problem or not, but I often copy/paste programs into @edit and I get the behavior where it "chugs" forever even though nobody else on the MUCK is affected. Pasting in the program in smaller chunks always goes faster/instantly.

I haven't researched the problem because honestly I didn't care enough :) But I'm 99% sure the problem is how input buffers are processed in the MUCK. Basically, the MUCK nibbles at the incoming bytes, building a linked list of input lines. I would bet that 8k of bytes is the magic number where stuff starts to get slow, because the MUCK really likes its 8k buffers.
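
To make the "nibbling" idea concrete, here is a toy Python model of turning incoming bytes into a queue of complete lines. It is only a sketch of the general technique, not the code in interface.c; the class name `LineAssembler` is made up for illustration.

```python
from collections import deque


class LineAssembler:
    """Toy model: turn arbitrary byte chunks into a queue of complete lines."""

    def __init__(self):
        self.partial = b""     # bytes of a line whose newline has not arrived yet
        self.lines = deque()   # complete lines waiting to be run as commands

    def feed(self, chunk: bytes) -> None:
        data = self.partial + chunk
        # Everything before the last newline is complete; the tail is kept.
        *complete, self.partial = data.split(b"\n")
        for line in complete:
            self.lines.append(line.rstrip(b"\r").decode("utf-8", "replace"))


# One pasted chunk queues five commands in a single read ...
pasted = LineAssembler()
pasted.feed(b"testprog\r\n" * 5)

# ... while typing delivers one line per read.
typed = LineAssembler()
typed.feed(b"testprog\r\n")
```

In this model the only difference between pasting and typing is how many commands are already waiting in the queue when the server next looks at it.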

Anyway, I think the "lag" I see (and maybe you're seeing as well, if I am understanding your problem correctly) is because it takes the MUCK many, many time slices to actually nibble away your entire paste. Also, when you type things in, you're providing your data in naturally separated lines. The MUCK performs the fastest when you give it a full line of code and then have a timeslice or more of pause, rather than giving it a wall of text.

I'm not sure I'm explaining the problem well; if you're curious I can probably find the relevant lines of code (I believe they are in interface.c for the most part, if I remember right). Anyway, this is profoundly unlikely to be fixed unless we switch the MUCK to multi-threading. You can ultimately blame the MUCK's timeslicing and input handling for this one.

Basically the MUCK is combatting one person's ability to lag everybody by pasting in huge blocks of text, so I'm not even sure this is something we really want to fix.

@tanabi There's a rate limiter for input. tp_command_burst_size, tp_command_time_msec, and tp_commands_per_time can be altered to make that effect far less pronounced.
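
As a rough illustration of how a limiter like this behaves, here is a toy quota model in Python. It assumes the usual TinyMUD-style scheme (each connection holds a quota that is capped at the burst size and refilled once per time slice). The class and the numeric values are hypothetical and are not taken from the Fuzzball source.

```python
class CommandQuota:
    """Toy per-connection limiter named after the three tunables (values made up)."""

    def __init__(self, command_burst_size=3, commands_per_time=1,
                 command_time_msec=1000):
        self.command_burst_size = command_burst_size
        self.commands_per_time = commands_per_time
        self.command_time_msec = command_time_msec
        self.quota = command_burst_size   # a fresh connection may burst

    def elapse(self, msec: int) -> None:
        """Refill the quota for every full time slice that has passed."""
        slices = msec // self.command_time_msec
        self.quota = min(self.command_burst_size,
                         self.quota + slices * self.commands_per_time)

    def run_pending(self, pending: list) -> list:
        """Run queued commands while quota remains; return the ones that ran."""
        ran = []
        while pending and self.quota > 0:
            ran.append(pending.pop(0))
            self.quota -= 1
        return ran


limiter = CommandQuota()
queued = ["testprog"] * 5
first = limiter.run_pending(queued)    # the burst allowance runs immediately
limiter.elapse(1000)                   # one time slice later ...
second = limiter.run_pending(queued)   # ... one more command is admitted
```

Under these assumptions, raising the burst size or shortening the time slice lets more of a paste through per heartbeat, which is why tuning them changes how pronounced the effect is.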

But this appears to be unrelated (or tangentially related?) and will happen even if you paste in just four lines (on my server), but if you type those same four lines really fast it will not.

Also, the program will actually run slower. It's not just a matter of it not getting to my commands fast enough. That's why I put the 'start' and 'done' lines in the code, so you can see it actually starts the execution and then gets really slow in the middle of running the program.

I still think the key is that if you're typing, or even pressing up then enter on a line, no matter how fast you're doing it, it's still an eternity in the computer world and you're giving the MUCK stuff in nibbles that it can accommodate naturally. :) So your original point that it's probably coming all in one packet is likely correct. However, I can't explain the program running slower, other than that the MUCK is taking up more time slices to actually process your input. The MUCK runs with a heartbeat of stuff it does, and you're giving it more stuff to do all at once in your packet. When you give it things at a slower pace, it has more time to process your MUF loop between commands, thus making it go faster.
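
One way to picture the time-slice theory is a toy round-robin model in Python. Everything here is an assumption made for illustration: one slice of one program runs per heartbeat tick, `slice_size` is a made-up number, and nothing is measured from a real server.

```python
from collections import deque


def finish_ticks(arrivals, work=100_000, slice_size=10_000):
    """Tick at which each program finishes under one-slice-per-tick round robin.

    arrivals[i] is the tick at which program i's command reaches the server.
    """
    remaining = [work] * len(arrivals)
    finished = [None] * len(arrivals)
    ready = deque()
    queued = set()
    tick = 0
    while None in finished:
        # Admit programs whose command has arrived by now.
        for i, t in enumerate(arrivals):
            if t <= tick and i not in queued:
                queued.add(i)
                ready.append(i)
        tick += 1
        if not ready:
            continue                  # idle heartbeat, nothing to run
        i = ready.popleft()
        remaining[i] -= slice_size    # run one slice of this program
        if remaining[i] <= 0:
            finished[i] = tick
        else:
            ready.append(i)           # back of the line for the next slice
    return finished


pasted = finish_ticks([0, 0, 0, 0, 0])      # all five arrive together
paced = finish_ticks([0, 10, 20, 30, 40])   # each arrives after the last is done
```

In this model every pasted program takes 46 to 50 ticks to get from "Start." to "Done.", while each paced program takes 10, even though the total work is identical. It reproduces the "each program runs slower" observation but not a large increase in total time, so at best it is part of the story.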

I think the only way to see it for sure would be to add some printf's to the main loop or watch it in GDB :D That would be an interesting show, blink and you'll miss it though :)

That's my theory anyway, you seem to know this stuff more than I do at this point :D Be curious if @wyld-sw knows anything.

My currently-limited understanding of the FB networking code does fit with @tanabi's suggestion. As we know, this is one of the reasons tf's /quote -dsend feature is helpful (or at least used to be!), since it sends files one line at a time. More wiggle room for the server to context-switch between processes.

I'd be interested to know if the situation is different (better, worse, or the same?) in FB6. I do know that there were some changes in FB7 output handling (#50), but maybe that's not related as we're talking about input.

Still, improving my understanding of this is certainly a goal of mine. If there's room for improvement, I'm all for it. Also, and I know it's a big feature, it might be good to consider multithreading for future releases (fb8+); doing that might be less painful if we do decide to rewrite in C++. I'm certainly open to discussion on any of that.

In theory, multithreading the MUCK actually isn't that bad -- there is a very clear separation of 'duties' and using a threadsafe queue-communication system would work very well.
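
A minimal sketch of that queue-communication idea, written in Python rather than the server's C so it stays short. The thread roles, the `STOP` sentinel, and the message shapes are all hypothetical; the point is only that the two sides share nothing except the queues.

```python
import queue
import threading

commands = queue.Queue()   # network thread -> interpreter thread
output = queue.Queue()     # interpreter thread -> network thread
STOP = object()            # sentinel that tells the worker to exit


def interpreter_worker():
    """Hypothetical interpreter thread: owns all command state itself."""
    while True:
        item = commands.get()
        if item is STOP:
            break
        player, line = item
        output.put((player, f"ran {line!r}"))


worker = threading.Thread(target=interpreter_worker)
worker.start()

# The "network" side only enqueues what it reads, then returns to its sockets.
for line in ["say Test", "say Test", "say Test"]:
    commands.put(("player1", line))
commands.put(STOP)
worker.join()

results = []
while not output.empty():
    results.append(output.get())
```

With this shape, a burst of pasted lines only makes the command queue longer; it does not hold up the thread that services other connections.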

In practice, the MUCK is massively not thread-safe. There's a lot of use of static buffers and occasional use of globals, especially in the world of command processing, where it is assumed that at any given point, commands are being processed for only a single player.

I have documented as many places as I have noticed that are not thread-safe, such that a simple grep for the word 'thread' will locate them. To give an idea of the scope of work, there are currently 32 such instances, not counting resolver.c, which apparently is already multi-threaded (probably to provide non-blocking DNS resolution).

That being said, I've only documented / examined about 50% of the code base, give or take, so that number will continue to rise. Plus I'm sure there are things I have missed.

Anyway -- I think multithreading the MUCK is certainly worth the effort because it will greatly increase the capacity and honestly make the dream of a 100% MUF-based command system more "sane" because we could run MUF at the very least in its own thread, with parallel MUF programs being a theoretical possibility (though there should be some 'sanity primitives' to prevent MUFs that shouldn't run in parallel from doing so ... anyway, that's a whole kettle of worms and maybe not a good idea).

This is a more severe problem than I originally thought.

It doesn't just slow you down, it slows everybody on the entire MUCK.

So all somebody needs to do to cripple my MUCK is to copy and paste in this:

say Test
say Test
say Test

(Because my 'say' program is a lot of instructions. It does some fancy stuff with color codes.)

That's all it takes to totally cripple the server for everybody. And the problem doesn't go away until everybody stops talking at the same time. So effectively they may have just crashed the server by pasting in those three lines.

Sadly I can't launch my new MUCK in this state :( So I guess I'm going to have to hunt down the source of this.

Hmmm actually it doesn't seem to hamstring everything. It seems to just be other long-lived programs.

I've reproduced this on the development server.

Run this:

testprog
testprog
testprog
testprog
testprog

And then as another user run testprog2 and you will see this other program by this other user being hamstrung as well.

But if I run a short-lived program like page #mail or something, it returns quickly.

Looks like if I take the instruction-intensive code in my color library and set it to preempt, the problem goes away.

That makes sense, preempt of course bypasses multitasking and it is designed to resolve the kind of problem you're having here.

You may be able to fix this problem by changing the instruction tune parameters ... not sure. This is a difficult one to fix without multi-threading or gutting the server.

Yes I think the multitasking code is very obtuse, too. But it's possible this is just some kind of oversight.

If I run these commands with quarter-second delays between them, it finishes almost instantly. If I paste them all at once, they take minutes of waiting to do the same task.
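
Until the cause is found, the delay can be applied on the client side, much like tf's /quote -dsend. Here is a minimal Python sketch; `send_paced` and `RecordingSocket` are hypothetical names, and the quarter-second default simply mirrors the delay mentioned above.

```python
import time


def send_paced(sock, lines, delay=0.25):
    """Send each line as its own write, pausing between lines."""
    sent = 0
    for line in lines:
        sock.sendall(line.encode("utf-8") + b"\r\n")
        sent += 1
        if sent < len(lines):
            time.sleep(delay)   # give the server a heartbeat or two to catch up
    return sent


class RecordingSocket:
    """Stand-in for a real socket so the sketch runs without a server."""

    def __init__(self):
        self.writes = []

    def sendall(self, data):
        self.writes.append(data)


fake = RecordingSocket()
count = send_paced(fake, ["testprog"] * 5, delay=0)
```

Against a real server, `sock` would be a connected socket (for example from `socket.create_connection`) and the delay would be left at its default.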

There's something in there that is supposed to say 'Okay, we're idle, time to do some work', and it is putting that off for reasons unknown, due to something that seems unrelated. Why should 'I have more commands after this one is done' have any effect on the speed of processing of this command? But it does.

I also wonder if perhaps this is a product of the time in which the code was written. Back then, this code would probably have taken minutes to run whether this issue existed or not, so it would have been impossible to even notice the issue.

I'll poke around. Once I have the time I'll stand up an FB6 instance and try to reproduce it there. If this is a regression, then that will make it easy to find, just by looking for the commit that introduced the problem.

If you poke around and come up empty, let me know and I'll take a whack at looking at it. :)