SvarDOS/bugz

System crashes when CTRL+C is pressed on CMDLINE and FDAPM is loaded

Closed this issue · 29 comments

When I press CTRL+C at the SvarCOM command prompt while FDAPM is loaded, the system crashes. Other config.sys / autoexec.bat entries do not seem to have any effect on this.

Not quite sure what is to blame. I'll try with EIDL...

Update: Only happens if FDAPM is loaded via autoexec.bat. If I spawn it from the command line, it seems to work. For now everything tested under QEMU.

Update 2: EIDL the same, works if called via command line. Crashes if loaded via autoexec.bat.

Interesting. I cannot reproduce this under VirtualBox; QEMU is probably doing something differently.

image

SvarCOM installs its own (very crude) CTRL+C handler:

image

This handler always says "kill the current program". I suspect that when FDAPM or another idle TSR is running, CTRL+C might arrive while the TSR is active, and things go south when the kernel tries to kill it (as instructed by SvarCOM) because some INTs are left hooked and dangling.

I suppose that the CTRL+C handler should check who is active and return with a simple iret if it is a TSR. Not sure how to detect this. I will have to read up on this topic.

Update 3: works with EDR command.com under QEMU. There is a problem with 4DOS: it hangs on launch (independent of FDAPM).

BUT: On my real machine (Pentium MMX) everything works!

So the question is: is there anything broken in the kernel which is only triggered by QEMU, or is it something in QEMU itself? Perhaps my Mac version of QEMU is flawed.

is there anything broken in the kernel which is only triggered by QEMU, or is it something in QEMU itself.

Or, more realistically: is it a SvarCOM bug that is being exposed by how QEMU reacts to HLTs... I'm pretty sure my CTRL+C handler is too simple to be correct.

Maybe it would be enough to check who the parent process is (through the current process PSP). And if it points to itself - never allow aborting. So the CTRL+C handler would essentially have to issue an INT 21h/AH=62h ("get current process PSP segment") and check the parent field (offset 16h) for a sane value (ie. not 0 and not itself), and ONLY THEN allow DOS to kill the process. I'm not sure this will work as I think it might, but that's my only theory so far.
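The check theorized above could be sketched roughly as follows. This is only an illustration of the idea, not SvarCOM's actual handler: the function name `ctrlc_may_abort` is made up, and the PSP is passed in as a plain byte buffer so the logic can be exercised outside of DOS. In a real handler the segment would come from INT 21h/AH=62h (returned in BX).

```c
#include <assert.h>
#include <stdint.h>

#define PSP_PARENT_OFFSET 0x16  /* word: segment of the parent's PSP */

/* Read a little-endian 16-bit word from a raw memory image. */
static uint16_t read_word(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: decide whether CTRL+C may abort the current process.
 * psp     - the bytes of the current process PSP (at least 0x18 bytes)
 * psp_seg - the segment of that PSP
 * Returns 1 only if the parent field looks sane (not 0 and not the process
 * itself), 0 otherwise. */
int ctrlc_may_abort(const uint8_t *psp, uint16_t psp_seg) {
    uint16_t parent;
    assert(psp != 0);
    parent = read_word(psp + PSP_PARENT_OFFSET);
    return parent != 0 && parent != psp_seg;
}
```

A process whose parent field points to itself (as is typical for the primary shell) would therefore never be aborted under this scheme.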

I will try to use QEMU to reproduce this and then try to fiddle with SvarCOM's CTRL+C handler.

Nope, I spoke too soon: it also happens on my real machine. But at least 4DOS does not hang when it starts.
[Image: IMG_0370]

It worked the first time I tried on my PC because SvarCOM was named SVARCOM.COM. But it set COMSPEC to COMMAND.COM, so it did not reload itself but EDR command.com instead.

The "ERR 3, FAILED TO LOAD..." message comes from rmod.asm (ie. the resident SvarCOM code that is tasked to execute applications and respawn COMMAND.COM). So it appears rmod was trying to ask DOS to execute "COMMAND.COM" and the INT 21h,4B00h call failed with error 3 -- "path not found". Pretty strange. By any chance have you tested this with the FD kernel? In any case, I will look into this in depth tomorrow.

Also occurs under FreeDOS kernel, albeit with a slightly different error message.
[Image: IMG_0371]

I am unable to reproduce the problem, neither on VirtualBox nor on real hw. I've also installed qemu and tried with it, but still no luck. Could you please test this floppy.zip image and tell me if it fails with your qemu? This is what I am getting (QEMU 7.1.0):

image

and this is how I am launching qemu:
qemu-system-i386 -fda floppy.img -vga cirrus -m 4M -cpu 486 -machine isapc -no-hpet

Your original FLOPPY.IMG does not break, BUT it can be made to. I altered the autoexec.bat by a few lines. If the line SET XX=ABC is placed after the FDAPM call, this works again?!? Also, COMSPEC is not set if FDAPM is commented out, but it is set after FDAPM is run. Is this intended?

FLOPPY.IMG.zip

BTW this is becoming a really interesting one :)

Thanks, the SET XX=ABC trick indeed did it - success! (sort of)
Quite esoteric stuff. It's also interesting to press CTRL+C again after the fatal ERR is displayed.

image

I will try to fiddle with this.

Indeed FDAPM can be replaced by any execution in autoexec, like PKG/?. Same effect.

Yes, this is because COMSPEC is only being set right before an external command is launched. Not really an intended behavior, but harmless enough that I never noticed it. I will see to improving this as well.

This is actually a limitation of EDR. When EDR spawns the shell, its environment is empty (non-existent, really), so SvarCOM does not know where it is located (there is no "EXEPATH" string past the environment). Only when SvarCOM is respawned can it learn its location. The only workaround I can think of is for SvarCOM to ask DOS for the boot drive (INT 21h,3305h) and naively assume that it is named COMMAND.COM and sits in the root directory of that drive, but this will be wonky. This is actually what RMOD (the resident part of SvarCOM) does when it has no COMSPEC to rely on... And it's the reason why the wrong shell is being respawned under EDR when SvarCOM is not at C:\COMMAND.COM. The problem is not present in FreeDOS since the FD kernel provides a proper stub environment to its shell.

So I fixed (I think) two things. Neither of them is related to the CTRL+C handler, nor to any TSR.
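To make the two lookups above concrete, here is a minimal sketch. It assumes the usual DOS 3+ layout: NUL-terminated NAME=VALUE strings, an empty string closing the list, then a 16-bit count followed by the full program path. The names `env_exepath` and `fallback_comspec` are illustrative and not taken from SvarCOM; the environment is a plain buffer so the code runs anywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return a pointer to the program path stored past the environment strings,
 * or NULL when the block carries no such path (e.g. an empty environment). */
const char *env_exepath(const char *env, size_t envlen) {
    size_t i = 0;
    unsigned count;
    if (env == NULL) return NULL;
    /* skip every NAME=VALUE string until the empty string that ends the list */
    while (i < envlen && env[i] != '\0') {
        while (i < envlen && env[i] != '\0') i++;
        i++; /* step over the string terminator */
    }
    /* env[i] is the closing NUL; a count word and a path must follow it */
    if (i + 3 >= envlen) return NULL;
    count = (uint8_t)env[i + 1] | ((unsigned)(uint8_t)env[i + 2] << 8);
    if (count == 0) return NULL;
    /* make sure the path is terminated inside the buffer */
    if (memchr(env + i + 3, '\0', envlen - (i + 3)) == NULL) return NULL;
    return env + i + 3;
}

/* Naive fallback: build "X:\COMMAND.COM" from the boot drive number as
 * reported by INT 21h/AX=3305h in DL (1 = A:, 2 = B:, 3 = C:, ...).
 * Returns 0 on success, -1 on a bad drive number or too small a buffer. */
int fallback_comspec(unsigned boot_drive, char *out, size_t outlen) {
    static const char tail[] = ":\\COMMAND.COM";
    if (boot_drive < 1 || boot_drive > 26 || outlen < sizeof(tail) + 1) return -1;
    out[0] = (char)('A' + boot_drive - 1);
    memcpy(out + 1, tail, sizeof(tail)); /* sizeof includes the NUL */
    return 0;
}
```

With an empty environment `env_exepath` returns NULL, which is exactly the situation where only the wonky boot-drive guess remains.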

First, the main culprit: SvarCOM provides its resident code (RMOD) with the offset of the COMSPEC value within the environment segment so RMOD does not have to parse the environment on its own. This offset was being updated AFTER the user input a command, so if SvarCOM was killed through a CTRL+C the COMSPEC offset was potentially pointing at some weird location. Fixed simply by updating this RMOD offset earlier.
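As an illustration of the offset being handed to RMOD, the sketch below scans an environment block for COMSPEC and records where its value starts. The `struct rmod_info` type, its `comspec_off` field and both function names are hypothetical stand-ins, not SvarCOM's real data structures. The point of the fix described above is about timing: a refresh like this has to happen before the shell waits for user input, not after.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data shared with the resident module. */
struct rmod_info {
    uint16_t comspec_off; /* offset of the COMSPEC value in the env segment */
    int comspec_valid;    /* 0 when COMSPEC is not defined */
};

/* Return the offset of the value of `name` inside a DOS-style environment
 * block (NUL-terminated strings, closed by an empty string), or -1 if the
 * variable is absent. Names are compared case-sensitively (assumption:
 * variable names are stored in upper case). */
long env_value_offset(const char *env, const char *name) {
    size_t namelen = strlen(name);
    size_t i = 0;
    while (env[i] != '\0') {
        size_t len = strlen(env + i);
        if (len > namelen && env[i + namelen] == '=' &&
            memcmp(env + i, name, namelen) == 0) {
            return (long)(i + namelen + 1);
        }
        i += len + 1; /* jump to the next string */
    }
    return -1;
}

/* Refresh the shared offset. Meant to be called BEFORE reading a command. */
void rmod_refresh_comspec(struct rmod_info *rmod, const char *env) {
    long off = env_value_offset(env, "COMSPEC");
    rmod->comspec_valid = (off >= 0);
    rmod->comspec_off = (off >= 0) ? (uint16_t)off : 0;
}
```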

Second, which was not the culprit now but could still lead to trouble in the future: when computing the offset of COMSPEC I was first generating a far pointer to its value and then providing RMOD with FP_OFF(fp_comspec_value). So RMOD then had to take the segment of the environment and add this FP_OFF value to it. I think this could fail because the far pointer might just as well be based on a higher segment and hence return an offset that is too low. For example, if COMSPEC is at position 16 of the environment, FP_OFF might return 0 because the far pointer assumed an increased segment. I reworked this computation, hopefully for the better.
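The "position 16" example can be worked through with a small model of real-mode far pointers. This is a sketch with made-up names (`struct farptr`, `normalize`, `offset_in_env`), not the reworked SvarCOM code: it only shows why the raw offset part depends on how the pointer happens to be expressed, while an offset derived from linear addresses does not.

```c
#include <stdint.h>

/* Model of a real-mode far pointer: linear address = seg * 16 + off. */
struct farptr {
    uint16_t seg;
    uint16_t off;
};

/* 20-bit linear address designated by a far pointer. */
uint32_t linear(struct farptr p) {
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Normalized form of the same address: the offset is reduced to 0..15 and
 * the rest is folded into the segment. */
struct farptr normalize(struct farptr p) {
    uint32_t lin = linear(p);
    struct farptr n;
    n.seg = (uint16_t)(lin >> 4);
    n.off = (uint16_t)(lin & 0x0F);
    return n;
}

/* Offset of p relative to env_seg:0000, whatever segment p is based on.
 * Assumes p lies within 64 KiB above the start of the environment. */
uint16_t offset_in_env(uint16_t env_seg, struct farptr p) {
    return (uint16_t)(linear(p) - ((uint32_t)env_seg << 4));
}
```

With the environment at segment 1000h, the pointer 1000h:0010h normalizes to 1001h:0000h: its offset part is 0, while `offset_in_env` still yields 16 for both forms.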

The result seems to work, at least on my qemu and virtualbox. I hope it's good now:
floppy-fixed.zip

Yes, seems that it does not crash anymore on CTRL+C :)

This is actually a limitation of EDR. When EDR spawns the shell, its environment is empty (non-existent, really)

I could try to store a COMSPEC in the config environment. This could then be transferred into a master environment, probably with other environment variables stored in the config environment.

I was afraid you would suggest this option. :) But yes, that could work. The alternative being the FreeDOS way of providing an environment stub (which could very well point to the current config env). The ugly thing about the config environment is its location in a wild memory area. Might break on INSTALLHIGH / DEVICEHIGH / SHELLHIGH or if any driver or config.sys-loadable TSR will try to allocate high memory.

Then the other minor thing left (for me) to fix is the horrible, terrible crash that occurs when one presses CTRL+C while RMOD is active (typically because it is displaying the error message "failed to load command.com"). Also, it would be nice if RMOD could show the exact full path that it tried to execute in such a situation.

Segment 60h could be used to put this into a defined place below the managed memory, making it relatively immune to being overwritten (at least not through DOS memory allocations).

So: set the PSP environment segment to 60h for process 0 (and of course fill it with sane values).

And leave the config environment as it is for backwards compatibility, so this would be an additional setting. The executable filename should then be appended to the environment at 60h.

All done for SvarCOM. I will fork the config environment bits into a separate EDR issue.

Sadly, the latest SvarCOM build you sent me fails to reload itself if I run it from drive C:. Also, sved hangs on save with this version if booted from C: (physical PC). Everything is fine if I try it from drive A:

The last version I had sent you was buggy; I have fixed a thing or two since then, so it is probably fine now. I will check with the current svn version and get back to you.

Quickly tested, and the current trunk version does work for me, both when booted from C: and from A:. Can you confirm? Attached.
svarcom-2024.4-beta.zip

Yes, works :)