freemint/teradesk

Crash if a .hyp file is dragged to hyp_view.app (MP)

xdelatour opened this issue · 18 comments

Steps to reproduce (ARAnyM snapshot):

  • Run runme.sh from latest snapshot (1-19-aa3)
  • Open two windows : c:\guides\ and u:\opt\GEM\hypview\
  • Drag a .hyp file to hyp_view.app
  • HypView opens its window but TeraDesk is killed

No problem without memory protection (by replacing aranym-mmu with aranym in runme.sh)

Such crashes typically occur when sending the program a VA_START message, but that will only happen when the program is already running. Is that the case?

Also, the code in teradesk looks quite ok to me:

teradesk/main.c

Lines 1258 to 1261 in 0b3356f

#if _MINT_
if (mint)
mode |= 0x40;
#endif

There may be only two problems:

  • Are you maybe using the TOS only version of teradesk? In that case, Mxalloc for the protection will not be used.
  • The memory is allocated global-readable, but not writable. Does hyp_view maybe try to write to it? Does the crash also occurr when using other programs as viewer, eg. St-Guide?

Edit: thinking about it again: if something is wrong with the protection, then it should be hyp_view which is killed, not teradesk. Strange. In any case, can you check the above cases? Maybe hyp_view is sending back something that teradesk does not like.

Hmm, if Xavier uses the aranym snapshot, that uses the "non-S" version as the desktop:

https://github.com/freemint/freemint/blob/aa307992c6cfa355e79961077ef587e4960a9bda/.scripts/util.sh#L319

So perhaps your second guess is closer to truth.

I made no changes to aranym-1-19-aa3.

  1. TeraDesk is the multitask version
  2. Same result with ST-Guide (V10.12.96)
  • hyp_view.app is not running at startup (checked in task manager)

Actually, I'm working on shel_help() with same issue when MP is enabled. I thought the bug was mine. It's why I tested an unmodified snapshot.
My current patch:
shel_help_v4.patch.txt

Some progress... Seems related to the AV_PROTOKOLL/VA_PROTOSTATUS handshake and this behaviour : https://github.com/freemint/hypview/blob/master/doc/history.txt#L365-L370

pid   8 (desktop): (Pid 8)evnt_multi for '  Tera Desktop', evnt=KBD|BUT|MSG, clks=0x102, msk=0x3, bst=0x0 T:-1
pid   8 (desktop): (Pid 8)status 4000, 0, C.redraws 0
pid   8 (desktop): (Pid 8) ~ get_mbstate:  md: clicks=-1, head=4D7D82A, tail=4D7D82A, end=4D7D87A
pid   8 (desktop): (Pid 8) ~ check_mouse - return 0, 732.339 for   Tera Desktop
pid   8 (desktop): (Pid 8) ~ Got pending message AV_PROTOKOLL for '  Tera Desktop' from 9 - 22,0,0,315,-24576
pid   8 (desktop): (Pid 8) ~ still_button multi?? o[0,2] 102,0, lock 0, Mbase 0, active.widg 0
pid   8 (desktop): (Pid 8) ~  -=- md: clicks=-1, head=4D7D82A, tail=4D7D82A, end=4D7D87A
pid   8 (desktop): (Pid 8) ~ [1]is_bevent? No; gotb 0; gotc 0; clks 0x102, msk 3, st 0
pid   8 (desktop): (Pid 8)check_queued_events for '  Tera Desktop', evnt=MSG, clks=0x102, msk=0x3, bst=0x0 T:-1
pid   8 (desktop): (Pid 8)status 4000, 0, C.redraws 0
pid   8 (desktop): (Pid 8) -- 10, 2DC, 153, 0, 0, 0, 0
pid   8 (desktop): (Pid 8) ~ [222]cancel_evnt_multi for '  Tera Desktop'
pid   8 (desktop): (Pid 8) evnt_multi[25] returned 16 for desktop
pid   8 (desktop): (Pid 8)Leaving AES   Tera Desktop
pid   8 (desktop): xaaes_on_signal for 8 (signal 10)
pid   8 (desktop): info_on_exit for 8 (signal 2560)
pid   8 (desktop): xaaes_on_exit event for 8 (2560)
pid   8 (desktop): (Pid 8) ~ exit_client: '  Tera Desktop'
pid   8 (desktop): (Pid 8)cancel_CE: find function 4F07762 in client events for   XaSYS
pid   8 (desktop): (Pid 8)cancel_CE: find function 4EFEFAA in client events for   XaSYS
pid   8 (desktop): (Pid 8) ~ swap_menu:   Tera Desktop, flags=4
pid   8 (desktop): (Pid 8) ~ exit ok
pid   8 (desktop): pid2client(4) -> p 4EA4000 client   XaAES(dbg) v1.6.4 Beta (m68020-060)
pid   8 (desktop): (Pid 8)added cevnt 4D87898(1) (head 4D87898, tail 4D87898) for   XaAES(dbg) v1.6.4 Beta (m68020-060)
pid   8 (desktop): (Pid 8)[3]Unblocked AESSYS   0x1
pid   8 (desktop): Sent CH_EXIT (premature client exit) to (pid 4)  XaAES(dbg) v1.6.4 Beta (m68020-060) for (pid 8)desktop
pid   8 (desktop): (Pid 8)remove_all_windows for '  Tera Desktop'
[...]
pid   8 (desktop): (Pid 8) ~ client exit done
pid   8 (desktop): info_release: releasing 0x4D7DAB4
pid   8 (desktop): xaaes_release: 4D7D044

Hm, that could only be related to the server name in the AV_PROTOKOLL message. But that one should be allocated here https://github.com/freemint/hypview/blob/ce849e69f6b6f73a364a4ae9b5498f30c64dda49/dl_av.c#L111

And i can't see anything wrong with that.

Only problem i can see is

teradesk/va.c

Line 822 in 2660eb5

strcpy(avwork.name, pp6);
where teradesk is blindly copying the returned client name (i'm not sure whether all applications really pass that information). But hypview does, so maybe that is not causing the error here.

I don't understand pp6 = *(const char *const *) (message + 6);
What happens if Hyp_View sends multiple AV_PROTOKOLL with different pointers?

My logfile is about 90000 lines, the above snippet is the second handshake (Teradesk is killed). The first handhsake is successful.

Other possibility, Hyp_View sends two AV_PROTOKOLL with the same pointer (at first launch, as application via appl_write - not accessory and after wind_open), the first VA_PROTOSTATUS implies the release of the memory, and the second may crash

I don't understand pp6 = *(const char *const *) (message + 6);

message[6/7] contain a pointer the clients application name. That construct fetches that as a pointer.

What happens if Hyp_View sends multiple AV_PROTOKOLL with different pointers?

That should not happen. AV_PROTOKOLL is send by hypview only once when initializing the AV_PROTOKOLL. But even if it does, the returned name is copied by teradesk, so worst thing that could happen is that hypview is registered twice.

Other possibility, Hyp_View sends two AV_PROTOKOLL with the same pointer (at first launch, as application via appl_write - not accessory and after wind_open), the first VA_PROTOSTATUS implies the release of the memory, and the second may crash

Hm. Normally, the order should be:

  • hypview sends AV_PROTOKOLL, in dl_init.c
  • teradesk gets the message, registers hypview, and answers with VA_PROTOSTATUS.
  • hypview receives VA_PROTOSTATUS, and frees the av_name used for AV_PROTOKOLL.
  • in window.c, when opening a window, DoAV_PROTOKOLL() should not be called again, because server_cfg is set from the VA_PROTOSTATUS message. But even if it is called again, av_name will be allocated again.

I can't think of any scenario where that would fail.

I guess your logfile is from XaAES? Maybe adding debug output to hypview & teradesk instead, to reduce the clutter. Then you can also see where the different messages are being sent from.

Thanks for your feedback.
Indeed, I'm running xaaesdeb.km

Maybe I'm wrong but I think AV_PROTOKOLL is sent twice and the first VA_PROTOSTATUS is received later.

  • appl_write(hyp_view), hyp_view sends AV_PROTOKOLL
  • hyp_view opens its window, it is not aware of av server yet, because it's still waiting for VA_PROTOSTATUS. Thus, it sends another AV_PROTOKOLL with the same pointer
  • the desktop reads the AV_PROTOKOLL message and replies with VA_PROTOSTATUS
  • hyp_view releases av_name
  • the desktop reads the other AV_PROTOKOLL and tries to copy the name but the memory block is no more valid

If I skip Mfree(av_name), all seem to work fine, there is no crash (but av_name is never released, that's bad).

Hm, i think that could be fixed by not sending another AV_PROTOKOLL if av_name is still set (if i did not miss anything, then it is only used for that purpose). That might fix the crash, but there will still be a communication problem, ie. hypview and/or teradesk will not know whether the other side understands quoting.

I also wonder whether that 2nd call in wind_open is needed at all. The first one should always be executed, even if the reply was not received yet.

I tried to add if (server_cfg) return; in DoAV_PROTOKOLL() but this doesn't work because av_name server_cfg is still null. It will be set later by DoVA_PROTOSTATUS.

The second call, in wind_open, is sent because the first is sent only if hyp_view is launched as application here.

If (application)

  • DoAV_PROTOKOLL
  • ...
  • DoAV_EXIT

If (accessory)

  • DoAV_PROTOKOLL at the first (wind_open) (only called once, supposing there is no further wind_open before VA_PROTOSTATUS)

[EDIT] server_cfg, not av_name

Maybe we can keep av_name all the time and only release it when exiting the application? (by moving Mfree from Do_VA_PROTOSTATUS to Do_AV_EXIT)

This implies that av_name will never be released when Hyp_View runs as a accessory.

In multi-tasking environments, there aren't any really accessories (in the sense of single-tos). And in single-tos, that Malloc()/Mfree() business is broken anyway, since ACCs aren't GEMDOS processes, and Malloc is done on behalf of the main application.

That means, for Single-TOS that part would have to be rewritten anyway (but that also applies to other allocations). The memory you got for av_name will be gone once the desktop starts another application, no matter whether you call Mfree or not. But i think we can just ignore that for now.

But we still should avoid sending AV_PROTOKOLL more than once.

But we still should avoid sending AV_PROTOKOLL more than once.

Is this change able to fix it?

  • Change: if(!server_cfg) (window.c)
  • To: if(!server_cfg && !av_name) (with extern char *av_name;)

Hence, AV_PROTOKOLL is called only if both server_cfg and av_name are null. If server_cfg is null but av_name is not null, it means Hyp_View is waiting for a VA_PROTOSTATUS

Yes, that's what i meant some posts above, i think that might work. But i cannot currently test it. And maybe move that test into DoAV_PROTOKOLL(), so av_name doesn't have to be exported.

Very small patch, I hope it can solve the issue.

I made only very few tests and it seems to work.

Unfortunately I found another one: when launched as accessory, a .hyp can be opened from the Desk menu (via file selector), others .hyp can be loaded by dragging to the window. But after closing the window, a new click in the Desk menu kills Hyp_View (with and without the above patch. Checked with unmodified snapshot freemint-1-19-8ea9d1ab-040-aranym.zip), I just changed hyp_view.app to hyp_view.acc. This happens only with memory protection enabled. Without MP, the window is reopened.

dl_av.c.patch.txt

So what about this one? Commit or wait for a more robust fix?

If Thorsten agrees I think we can commit (if the variable is not null, it means that we are still waiting for the response, so do not send a second message)