ZornsLemma/basictool

Nested-PROCs bug in PRES ABE Pack

Opened this issue · 8 comments

lurkio commented

In 8-bit BBC BASIC, everything on a line after a DEF is ignored at run-time unless the named procedure or function is being called. This allows procedures and functions to be nested to use common code, so you can do for example:

1000DEFFNread(start,length,memory):opcode=&8
1010DEFFNwrite(start,length,memory):opcode=&A
1020=FNosword(&72,opcode,start,length,memory)

with no danger of accidentally writing when calling FNread.

But when the above program is run through PRES ABE Pack with the Concatenate option selected, this is the result:

232DEFFNread(start,length,memory):opcode=&8
1010DEFFNwrite(start,length,memory):opcode=&A:=FNosword(&72,opcode,start,length,memory)

So it looks like PRES ABE Pack doesn't respect nested PROCs when Packing with Concatenate on.

Is there any way to make basictool detect this issue and print a warning?

More details: https://stardot.org.uk/forums/viewtopic.php?p=388299#p388299

(Incidentally, I don't know why the first line has been renumbered from 1000 to 232 there, but it vaguely reminds me of another bug (or "feature") in BASIC or PRES ABE Pack that we've discussed in the past... I'll try and dredge up the details... EDIT: Here's your comment, SteveF, from the basictool thread on Stardot: 'Incidentally I see some corruption of the first line number sometimes, but I think that's just an artefact of using "OLD" to recover the program on entry to BASIC when the first line number is >255 and isn't a fundamental problem.' I replied: 'Oh! I wasn't really aware of that bug. But, just for the record, here's another mention of it, in the middle of an amazing thread about all sorts of other BASIC hackery: https://stardot.org.uk/forums/...')

I think the line numbering change isn't a problem, unless I'm missing the point. Here's it being reproduced in b-em on a BBC B with BASIC 2 without ABE Pack being involved at all. Note that 1000 MOD 256 is 232, i.e. the low byte of the first line number is being preserved by OLD but the high byte is lost.
Screenshot at 2023-06-16 21-06-43

It might be possible to work around this by having basictool preserve the high byte of the first line number before invoking ABE Pack and then restoring it afterwards, but I'd be slightly (not very) worried this could introduce strange line numbering problems.

Thinking about how to warn if this nested PROCs bug occurs, it feels like (unlike #7) the issue can only be noticed by examining the original version of the code. The crunched version is not equivalent to the input in your example, but it is potentially "reasonable" in its own right, at least at the level basictool is likely to be able to detect.

This makes me wonder if we need to analyse the pre-crunched program. It doesn't seem that easy though - in this case we need to notice a) that we have a DEF FN which b) has fallen through past another DEF FN c) which will have its return assignment merged onto the line containing the DEF FN. I'm sure this is doable, but it feels potentially error-prone.

I'm just thinking out loud here...

lurkio commented

I think the line numbering change isn't a problem

Agreed.

lurkio commented

Maybe it’s enough to simply detect that there are more DEFPROCs than ENDPROCs (or DEFFNs than =‘s) and then just spit out a warning to say that Pack might have messed some of them up..?

Maybe it’s enough to simply detect that there are more DEFPROCs than ENDPROCs (or DEFFNs than =‘s) and then just spit out a warning to say that Pack might have messed some of them up..?

That's not a bad idea, although I am a bit concerned about the risk of mixing this with things like:

DEF PROCfoo(n)
IF n=5 THEN PRINT "Five!":ENDPROC
PRINT n
ENDPROC

I don't think that's an unusual style - at the very least, I tend to do this sort of thing quite often to work around lack of multi-line IFs in 6502 BBC BASIC.

I do see your suggestion is to warn if there are more DEFPROCs than ENDPROCs, so doing the above in isolation is fine. However, the above code would give us one ENDPROC's worth of "credit" which would stop the warning working as intended, if I haven't got myself confused.

lurkio commented

I do see your suggestion is to warn if there are more DEFPROCs than ENDPROCs, so doing the above in isolation is fine. However, the above code would give us one ENDPROC's worth of "credit" which would stop the warning working as intended

Damn! Hadn’t thought of that! And you can’t necessarily parse the source, line by line, matching up PROCs and ENDPROCs, because that wouldn’t take into account spaghetti GOTOs in the middle of PROCs!

Nightmare.

I think the line numbering change isn't a problem

It might be possible to work around this by having basictool preserve the high byte of the first line number before invoking ABE Pack and then restoring it afterwards

I've had a go at implementing this, as it seemed easy. It's been lightly tested but seems to work. Feel free to give the tagged version https://github.com/ZornsLemma/basictool/releases/tag/v0.11-pre a try (no rush at all).

I do see your suggestion is to warn if there are more DEFPROCs than ENDPROCs, so doing the above in isolation is fine. However, the above code would give us one ENDPROC's worth of "credit" which would stop the warning working as intended

Damn! Hadn’t thought of that! And you can’t necessarily parse the source, line by line, matching up PROCs and ENDPROCs, because that wouldn’t take into account spaghetti GOTOs in the middle of PROCs!

Nightmare.

Agreed. ;-)

I'm not super keen, but perhaps the best we can do is to analyse the original unpacked source - something like:

in_proc = False
for statement in program: # loop over all the statements in the entire program in order
    if statement is "def proc":
        if in_proc and allow_concat:
            warn("Possible nested PROC concatenation risk")
        in_proc = True
    elif statement is "endproc":
        in_proc = False

and similarly for DEF FN and "statement starting with = to return from a function". We would do this only if some kind of "--warn" option is enabled, although just maybe we'd default that to on instead and allow it to be turned off with an option - I'm not sure.

That's off the top of my head, so if I've missed some important detail please do point it out. Don't worry about nitpicking - I'd much rather find problems or better ways now before I implement this. :-)

This will not address the GOTO spaghetti case, but I suspect handling that is not really a practical proposition. What I'm most interested in here is that typical "vaguely reasonable" code doesn't get spurious warnings.

Edit: This is fairly obviously imperfect - e.g. if a procedure has multiple ENDPROCs the warning may not be generated. It's just an attempt to generate a warning in "typical" cases without spamming the user with so many false positives it's annoying and force the user to turn it off. We'd also probably want to report the line number of the "previous" and "nested" DEF PROCs to the user in the warning - the above is just a sketch.

Edit: Would it be sufficient to iterate over the program lines instead of statements? A very quick experiment suggests DEF PROC only works at the start of a line. This would make things a little easier to implement, as it avoids parsing each line to find ":" and worrying about quoted strings and so forth. No, this won't work, will it? DEF PROC might only work at the start of lines, but ENDPROC can occur part-way through a line, as in my example earlier. So we have to work through all the statements.