Test for language features
ojwb opened this issue · 3 comments
It would be good to have testing of Snowball language features (especially those not used by any current algorithm implementation) which ran for each target language.
As David Corbett noted in #156, several backends weren't implementing integer division. I've fixed them, but we lack a regression test, and lack automated testing that new backends get this right.
This is the test code I added at the start of stem in english.sbl locally to check these fixes worked and that other backends weren't affected:
$p1 = 7
$p1 /= 4
$p1 == 1
$(7 / 4 * 4 == 4)
$p1 = -7
$p1 /= -4
$p1 == 1
$(-7 / -4 * 4 == 4)
$p1 = -7
$p1 /= 4
$p1 == -1
$((-7) / 4 * 4 == -4)
$p1 = 7
$p1 /= -4
$p1 == -1
$(7 / -4 * 4 == -4)
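For reference, the four sign combinations above match how `/` behaves on integers in C99 and later, where the quotient is truncated toward zero. The following is a minimal sketch in C of the same checks, not part of the Snowball test suite. The names `check_truncating_division` and `floor_div` are hypothetical; `floor_div` is included only to show the results a backend would produce if its division rounded toward negative infinity instead.

```c
#include <stddef.h>

/* Hypothetical helper, for contrast only: floor division rounds toward
 * negative infinity, which is NOT what C's '/' does on integers. */
int floor_div(int a, int b) {
    int q = a / b;
    /* Adjust when there is a remainder and the operands differ in sign. */
    if ((a % b != 0) && ((a < 0) != (b < 0))) q--;
    return q;
}

/* Checks the four sign combinations used in the Snowball test above.
 * Returns the number of failed cases (0 means all passed). */
int check_truncating_division(void) {
    static const struct { int a, b, expected; } cases[] = {
        {  7,  4,  1 },
        { -7, -4,  1 },
        { -7,  4, -1 },
        {  7, -4, -1 },
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        /* volatile discourages the C compiler from folding the division,
         * so the check exercises the division actually emitted. */
        volatile int a = cases[i].a, b = cases[i].b;
        if (a / b != cases[i].expected) failures++;
    }
    return failures;
}
```

With truncation, -7 / 4 and 7 / -4 both give -1, whereas floor division would give -2 for both, so the mixed-sign cases are the ones that distinguish the two behaviours.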
The manual says the pieces of an arithmetic expression have the same semantics as in C, so here are some tests for minint and maxint based on the C standard.
$(minint <= -32767)
$(maxint >= 32767)
$(minint + maxint == 0) or $(minint + maxint == -1)
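The three checks above can be mirrored in C against `INT_MIN` and `INT_MAX` from `<limits.h>`. This is a sketch under the assumption that minint and maxint correspond to the limits of C's `int`; the function names `int_range_ok` and `c_int_range_ok` are hypothetical. The sum is 0 for a symmetric range (sign-magnitude or ones' complement) and -1 for two's complement.

```c
#include <limits.h>

/* Returns 1 if [min, max] satisfies the three checks, else 0.
 * long long is used so that min + max cannot overflow for int inputs. */
int int_range_ok(long long min, long long max) {
    if (min > -32767) return 0;      /* minint <= -32767 */
    if (max < 32767) return 0;       /* maxint >= 32767 */
    long long sum = min + max;
    return sum == 0 || sum == -1;    /* symmetric or two's complement */
}

/* Applies the checks to the limits of this C implementation's int. */
int c_int_range_ok(void) {
    return int_range_ok(INT_MIN, INT_MAX);
}
```

A signed 8-bit range such as -128..127 fails the first two checks, while both -32767..32767 and -32768..32767 pass.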
Thanks.
I'd not considered that "C semantics" leads to imposing these requirements on minint and maxint, but I think it's helpful to have a defined minimum integer range. In practical terms, stemming algorithms would probably be fine with even a signed 8-bit integer, but sticking with the "C semantics" rule seems good, and it's unlikely that supporting a 16-bit integer would be problematic for any language we're likely to target.
I've just pushed a change that implements compile-time evaluation of numeric subexpressions and tests. This is mostly a step towards a longer-term plan to track possible ranges for the values of integer and boolean variables, cursor position, length of the current string, slice positions, etc. through the program, as there are optimisations we can do based on these. It's relevant here because the division test code above will need revising so that we test the division semantics of the generated code in the target language rather than those of the compiler.
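To illustrate why literal-only tests stop exercising the backends, here is a minimal sketch of constant folding over a tiny expression tree. It is not the Snowball compiler's implementation; the `node` type and `fold` function are hypothetical. An expression built only from literals, such as 7 / 4 * 4, collapses to a single literal at compile time, so the generated code never performs the division. An expression involving a variable is left for the target language to evaluate.

```c
#include <stddef.h>

enum kind { LIT, VAR, ADD, MUL, DIV };

struct node {
    enum kind kind;
    int value;                  /* meaningful only when kind == LIT */
    struct node *left, *right;  /* operands of ADD, MUL and DIV */
};

/* Folds constant subtrees in place. Returns 1 if n is now a literal,
 * 0 if it still depends on a variable. Overflow is not handled here. */
int fold(struct node *n) {
    if (n->kind == LIT) return 1;
    if (n->kind == VAR) return 0;
    int l = fold(n->left);
    int r = fold(n->right);
    if (!(l && r)) return 0;
    int a = n->left->value, b = n->right->value;
    if (n->kind == DIV && b == 0) return 0;  /* leave division by zero alone */
    switch (n->kind) {
    case ADD: n->value = a + b; break;
    case MUL: n->value = a * b; break;
    default:  n->value = a / b; break;  /* DIV, using the compiler's own '/' */
    }
    n->kind = LIT;
    n->left = n->right = NULL;
    return 1;
}
```

In this sketch the folded division uses the division of the language the compiler is written in, which is why a test built purely from literals would check the compiler rather than the generated code.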