Makopo/lslint

Global variables initialized with negative values throws an error

Closed this issue · 14 comments

Take the following code sample:

integer user_configured_mode = -1;
float offset_from_start = -1.0;
vector position_delta = <1.0,1.0,-1.0>;
default
{
	state_entry()
	{
		llOwnerSay((string)user_configured_mode);
		llOwnerSay((string)offset_from_start);
		llOwnerSay((string)position_delta);
	}
}

This code sample will produce the following errors:

SublimeLinter: lslint: constant_test.lsl ['G:\\devel\\secondlife\\tools\\lslint\\binary\\windows64\\lslint.exe'] 
SublimeLinter: lslint output:
ERROR:: (  1, 32): Global initialize must be constant.
ERROR:: (  2, 27): Global initialize must be constant.
ERROR:: (  3, 34): syntax error, unexpected '-'
ERROR:: (  8, 22): `user_configured_mode' is undeclared.
ERROR:: (  9, 22): `offset_from_start' is undeclared.
ERROR:: ( 10, 22): `position_delta' is undeclared.
TOTAL:: Errors: 6  Warnings: 0 

I have reproduced on all 3 windows binaries (windows, windows32, windows64).

I do not believe this error to be justified as the in-world compiler produces working code without complaint.

I attempted to suppress E_GLOBAL_INITIALIZER_NOT_CONSTANT myself, but I end up with unexpected '-' which doesn't help me much.
I am not too familiar with the non-C parts of LSLint so unfortunately I cannot figure this one alone.

Sorry, my fix for the sign issues was bogus, as it introduced this problem. A temporary solution until I find how to fix it for good, is this:

diff --git a/lslmini.y b/lslmini.y
index 43920ae..8d10c52 100644
--- a/lslmini.y
+++ b/lslmini.y
@@ -305,11 +305,13 @@ global_variable
 	{
     $$ = new LLScriptGlobalVariable($1, $3);
 	}
+/*
     | name_type '=' expression ';'
     {
     ERROR(&@3, E_GLOBAL_INITIALIZER_NOT_CONSTANT);
     $$ = NULL;
     }
+*/
     | name_type '=' error ';'
     {
     $$ = NULL;
@@ -347,6 +349,10 @@ constant
 	{
     $$ = new LLScriptIntegerConstant($1);
 	}
+	| '-' INTEGER_CONSTANT
+	{
+    $$ = new LLScriptIntegerConstant(-$2);
+	}
 	| INTEGER_TRUE																	
 	{
     $$ = new LLScriptIntegerConstant($1);
@@ -359,6 +365,10 @@ constant
 	{
     $$ = new LLScriptFloatConstant($1);
 	}
+	| '-' FP_CONSTANT
+	{
+    $$ = new LLScriptFloatConstant(-$2);
+	}
 	| STRING_CONSTANT
 	{
     $$ = new LLScriptStringConstant($1);

EDIT: I would appreciate help with how to coerce the simple_expression rule to take precedence over the expression rule, which is what forced me to comment out the E_GLOBAL_INITIALIZER_NOT_CONSTANT error generation.

This is somewhat hackish but it works as expected:

diff --git a/logger.cc b/logger.cc
index 7678b9f..5cf7f4a 100644
--- a/logger.cc
+++ b/logger.cc
@@ -258,7 +258,7 @@ const char *Logger::error_messages[] = {
    "Returning a %s value from a %s function.",
    "Not all code paths return a value.",
    "%s", // Syntax error, bison includes all the info.
-   "Global initialize must be constant.",
+   "Global initializer must be constant.",
    "State must have at least one event handler.",
    "`%s' is a constant and cannot be used as an lvalue.",
    "`%s' is a constant and cannot be used in a variable declaration.",
diff --git a/lslmini.y b/lslmini.y
index 43920ae..87397d1 100644
--- a/lslmini.y
+++ b/lslmini.y
@@ -301,6 +301,14 @@ global_variable
 	{
     $$ = new LLScriptGlobalVariable($1, NULL);
 	}
+	| name_type '=' '-' INTEGER_CONSTANT ';'
+	{
+    $$ = new LLScriptGlobalVariable($1, new LLScriptSimpleAssignable(new LLScriptIntegerConstant(-$4)));
+	}
+	| name_type '=' '-' FP_CONSTANT ';'
+	{
+    $$ = new LLScriptGlobalVariable($1, new LLScriptSimpleAssignable(new LLScriptFloatConstant(-$4)));
+	}
 	| name_type '=' simple_assignable ';'
 	{
     $$ = new LLScriptGlobalVariable($1, $3);
@@ -347,6 +355,10 @@ constant
 	{
     $$ = new LLScriptIntegerConstant($1);
 	}
+	| '-' INTEGER_CONSTANT
+	{
+    $$ = new LLScriptIntegerConstant(-$2);
+	}
 	| INTEGER_TRUE																	
 	{
     $$ = new LLScriptIntegerConstant($1);
@@ -359,6 +371,10 @@ constant
 	{
     $$ = new LLScriptFloatConstant($1);
 	}
+	| '-' FP_CONSTANT
+	{
+    $$ = new LLScriptIntegerConstant(-$2);
+	}
 	| STRING_CONSTANT
 	{
     $$ = new LLScriptStringConstant($1);

Test case:

integer a = -3;
float b = -1.5;
float c = -a;
default{timer(){
b = (float)-2 + a;
}}

Before the patch:

ERROR:: (  1, 13): Global initialize must be constant.
ERROR:: (  2, 11): Global initialize must be constant.
ERROR:: (  3, 11): Global initialize must be constant.
ERROR:: (  5, 12): syntax error, unexpected '-'
TOTAL:: Errors: 4  Warnings: 0

After the patch:

ERROR:: (  3, 11): Global initializer must be constant.
TOTAL:: Errors: 1  Warnings: 0

Thank you for reporting @XenHat and suggesting fixes @Sei-Lisa .
I created the branch fixing#6 and put the binaries into it.
Please try ahead, thanks.

Short version: Please put back these lines in the patch:

     | name_type '=' expression ';'
     {
     ERROR(&@3, E_GLOBAL_INITIALIZER_NOT_CONSTANT);
     $$ = NULL;
     }

Long version:

The first patch was a temporary solution that removed the error, because bison for some reason preferred to take the expression branch whenever a minus sign was present.

The second patch totally substituted the first. Its purpose was to address this problem and keep the error message, at the cost of introducing even more ambiguity and redundancy in the grammar.

However, the commit above removes the error. When tested against the test case above, it just says "Syntax error", which is much less explanatory than "Global initializer must be constant".

So, please re-add the error message.

I addressed above. Thank you.

Nice. I'll try that as soon as possible.

image
👌

@Sei-Lisa @Makopo

While we're on the topic of fixing parsing stuff, would it be possible to implement Makopo/sublime-text-lsl#36 ? We'd have to make lslint ignore (or even better safety-check) the preprocessor format (available at http://wiki.phoenixviewer.com/fs_preprocessor , at least that's the implementation I use right now).
I have tried and failed, I don't understand BISON.

I merged #6 into master. Please re-clone master branch to see the change since I have deleted all the binaries from repo history.

Everything seems to be in order, this solves the issue for me.

Unfortunately, this is not fully fixed: vectors and rotations in globals can't have negative components. Please reopen.

I've pushed a fix to a branch in my repository, Sei-Lisa/lslint@aac0941

BEGIN tl;dr BLOCK ABOUT INTERNALS

It's very invasive. The old approach was to parse the simple expressions (simple_assignable in the grammar) that are accepted in the globals in the same way as in the original LSL, and then add a rule afterwards that would catch all other expressions. But that made the grammar ambiguous, and bison didn't always do the right thing, choosing the expression branch that emits the error whenever there was a minus sign somewhere.

The new approach is to just parse the global as a general expression, removing the simple_assignable and associated rules completely (except constant, which is in use in another part of the grammar), and then check if the resulting tree is "simple"; if it is, build the LLScriptSimpleAssignable nodes from the expression's nodes; if it's not, emit an error. Extreme care had to be put in ensuring that the new format accepts and rejects the same grammar as the SL compiler, as the new test case shows. I think it leaks when it abandons the expression's nodes - there's room for improvement there.

END tl;dr BLOCK ABOUT INTERNALS

The worst part of it is that it reads builtins.txt twice. The reason is that it's normally read at the time LLScriptScript is created, yet that only happens on successful parsing of the whole script. The problem with that approach is that the parser does need the built-ins, because the syntax in some cases depends on whether a token is a constant or a user identifier. For example, you can do (string)-PI but not (string)-myvar. The patch also fixes that case, which was broken. (There's something still broken, namely creating a local variable with the name of an event is currently accepted. That's in my to-do list.)

So, before i send a PR, I'd like to see opinions on what to do about the double read. I guess that the right fix could be something like this:

  • builtins.cc currently uses the LLScriptScript class to put the result in; make it independent somehow instead. Ideas on where to put the symbols are welcome. Only the symbol table is really necessary. Wherever it ends, the parser needs to be able to read it. The patch puts it within a global variable g_script. This step is not strictly necessary, but it feels compelling.
  • Add some mechanism for copying the symbol table from wherever builtins.cc will put it, to the LLScriptScript instance. No idea how yet.

An alternative would be to create the LLScriptScript instance at the beginning of the parsing, perhaps in a variable available to the parser, rather than at the end, and set the missing fields (the globals and the states) at the end.

Any thoughts?

The approach I tend to use is to create getter and setter functions for such "passing", but a global is probably good enough for the size of this project. I can't really pronounce myself on that kind of complex internals for reasons explained before. Whatever works, I suppose.

I've opted for the alternative approach, because even though it's less powerful, it was less invasive because the global already existed. The drawback is that every time a script is parsed, builtins.txt are read again, but I don't foresee lslint parsing a script more than once.