*@a and *(@a) give different results for byte-sized variables
leanto opened this issue · 7 comments
I was just experimenting with different addressing and I discovered that if a
is declared as a byte, *@a
and *(@a)
give different results:
include "inc/cmdsys.plh"
byte a
byte b
a = 5
b = 6
puti(*@a) // 5
putln()
puti(*(@a)) // 1541
putln()
Specifically, it appears that *@a
remembers that a
is a byte while *(@a)
does not.
Is this intended behavior?
Following up on this, I tried the same test with *@x
when x
is declared as a word:
include "inc/cmdsys.plh"
word x
x = $5566
puti(^@x) // 85
putln()
puti(^(@x)) // 85
putln()
puti(*@x) // 21845
Here, ^@x
does not "remember" that x
is a word, so I think the behavior with byte values is a bug.
Yep, it looks like types don't get tracked properly through the expression. The '*' operator should have precedence and return a word. I'll into where the '@' kept the byte type when it shouldn't have.
Thanks for the great sample code, too!
Dave...
For the fun of it, I tried to figure out what the problem is. Here's what I managed to figure out:
After puti
gets parsed as a function call, parse_list
gets invoked to parse all of the arguments. This in turn calls parse_expr
which calls parse_value
. parse_value
is where things get interesting.
The first time through the while (scan()) {...}
loop, scantoken == WPTR_TOKEN
, causing the if (scantoken == BPTR_TOKEN || scantoken == WPTR_TOKEN)
block to run. Here type = WPTR_TYPE
and deref = 2
. Then the loop starts again.
The second time through the while (scan()) {...}
loop, scantoken == AT_TOKEN
and deref = 1
.
The third time through the while (scan()) {...}
loop, scantoken == ID_TOKEN
. In that block, tokenstr = "a) // 5"
and tokenlen = 1
, meaning that the token is "a"
, as expected. The result of id_type(tokenstr, tokenlen) = BYTE_TYPE
, giving type = 260
.
This causes flow to fall into the else
block where value = 27 = CONST_TYPE | WORD_TYPE | ASM_TYPE | DEF_TYPE
. This causes valseq
to be set a t_opseq
that will load the global address of a
. This is fine.
The fourth time through the while (scan()) {...}
loop, scantoken == CLOSE_PAREN_TOKEN
, causing the loop to exit. The while (deref > rvalue) {...}
loop is skipped. Then flow enters if (deref) {...}
.
At this point we have this block wherein the problem lies:
if (type & FUNC_TYPE)
{ ... }
else if (type & (BYTE_TYPE | BPTR_TYPE))
valseq = gen_lb(valseq);
else if (type & (WORD_TYPE | WPTR_TYPE))
valseq = gen_lw(valseq);
else
parse_error("What are we dereferencing?");
Because type = BYTE_TYPE | WPTR_TYPE
, this causes the LB
instruction to be generated instead of LW
. Were the if
clauses inverted, it would generate LW
correctly, but ^@word_var
would also generate LW
, which is also wrong.
Perhaps the interplay of the WORD_TYPE
and BYTE_TYPE
with WPTR_TYPE
and BPTR_TYPE
can be fixed here. If I understand correctly, the WORD_TYPE
and BYTE_TYPE
can only come from the symbol that's being operated on, but the WPTR_TYPE
and BPTR_TYPE
are the type of dereference that's actually happening. So maybe something like:
else if (type & (BPTR_TYPE))
valseq = gen_lb(valseq);
else if (type & (WPTR_TYPE))
valseq = gen_lw(valseq);
else if (type & (BYTE_TYPE)
valseq = gen_lb(valseq);
else if (type & (WORD_TYPE))
valseq = gen_lw(valseq);
would work because it gives preference to the pointer type.
However, maybe this is the wrong approach because, in actually tracing through, it seems like *@a == @*a
which seems even more wrong. And this test shows that it's so:
include "inc/cmdsys.plh"
byte a
byte b
a = 5
b = 6
puti(*@a) // 5
putln()
puti(*(@a)) // 1541
putln()
puti(@*a) // 5
putln()
done
Perhaps the interplay of the WORD_TYPE and BYTE_TYPE with WPTR_TYPE and BPTR_TYPE can be fixed here. If I understand correctly, the WORD_TYPE and BYTE_TYPE can only come from the symbol that's being operated on, but the WPTR_TYPE and BPTR_TYPE are the type of dereference that's actually happening. So maybe something like:
else if (type & (BPTR_TYPE)) valseq = gen_lb(valseq); else if (type & (WPTR_TYPE)) valseq = gen_lw(valseq); else if (type & (BYTE_TYPE) valseq = gen_lb(valseq); else if (type & (WORD_TYPE)) valseq = gen_lw(valseq);
would work because it gives preference to the pointer type.
This is probably the easiest approach without rewriting the prefix operator handling. I've gone back and forth on this over the years and keep finding situations where it breaks down.
However, maybe this is the wrong approach because, in actually tracing through, it seems like *@A == @*a which seems even more wrong. And this test shows that it's so:
include "inc/cmdsys.plh" byte a byte b a = 5 b = 6 puti(*@a) // 5 putln() puti(*(@a)) // 1541 putln() puti(@*a) // 5 putln() done
Note that @*a is non-sensical and should throw an error (but currently doesn't). You can only take the address of a variable/function, not an expression or intermediate value.
@dschmenk, I think this can be closed. (Unless you want to use this for fixing the prefix operator parsing, that is.)
I should at least add warnings/errors for non-sensical order of pre-ops.