dibyendumajumdar/ravi

Reconsider function return-type annotation

skunkiferous opened this issue · 4 comments

AFAIK, function return types are normally specified in statically typed languages for (at least) 3 reasons:

  1. Documentation for the programmer
  2. Validation of the function implementation, when checking/compiling the function itself
  3. "Optimization" of the code that calls the function, since the compiler knows what the return value type will be.

In the manual, you say the following:

Function return types cannot be annotated because in Lua, functions are un-named values and there is no reliable way for a static analysis of a function call’s return value.

But this only applies to #3. #1 and #2 are still valid reasons to annotate the return type. I already know I would write all my functions like this:

function x(s1: string, s2: string) -- string
  return @string( s1 .. s2 )
end

This is silly. What I really want is to write them like this:

function x(s1: string, s2: string): string
  return @string( s1 .. s2 )
end

Even if the annotation were not validated at all, I think it would be worth having for documentation purposes alone.

Yes, I agree that it would be good to add it. I just don't know how I can easily enforce this at a call site. But I agree it should be possible to check the function itself, or to insert type assertions on return values.

I've had some success trying to implement this.

lparser.c

static void body (LexState *ls, expdesc *e, int ismethod, int line, int deferred) {
  /* body ->  '(' parlist ')' block END */
  FuncState new_fs;
  BlockCnt bl;
  new_fs.f = addprototype(ls);
  new_fs.f->linedefined = line;
  open_func(ls, &new_fs, &bl);
  if (!deferred) {
    checknext(ls, '(');
    if (ismethod) {
      new_localvarliteral(ls, "self");  /* create 'self' parameter */
      adjustlocalvars(ls, 1);
    }
    parlist(ls);
    checknext(ls, ')');
  }
  /* look for an optional return type annotation after the parameter list */
  if (testnext(ls, ':')) {
    TString *typename = str_checkname(ls);
    const char *str = getstr(typename);  /* the type name is here; what to do with it? */
    /* maybe something like this could support type checking of multiple return values */
    // expdesc e = {.ravi_type_map = RAVI_TM_ANY, .pc = -1};
    // int nexps = explist(ls, &e);
    /* explist (static int explist (LexState *ls, expdesc *v)) would also have to be
       declared before this function, otherwise this does not compile */
  }
  
  statlist(ls);
  new_fs.f->lastlinedefined = ls->linenumber;
  check_match(ls, TK_END, TK_FUNCTION, line);
  codeclosure(ls, e, deferred);
  close_func(ls);
}

So, I've successfully parsed the return type annotation, but I have no idea what to do with it next.
I found something related in this function:

  static void ravi_typecheck(LexState *ls, expdesc *v, int *vars, int nvars, int n)
...
        int nrets = GETARG_C(*pc) - 1; /* operand C contains
                                          number of return values expected  */
        /* Note that at this stage nrets is always 1
         * - as Lua patches in this value for the last
         * function call in a variable declaration statement
         * in adjust_assign and localvar_adjust_assign */
        /* all return values that are going to be assigned
           to typed local vars must be converted to the correct type */
        int i;
        for (i = n; i < (n+nrets); i++)
          /* do we need to convert ? */

Maybe we can check return values somewhere around there.
But we also have to store the expected return types somewhere (the ones we parsed in the previous step).

I still don't understand the internal logic well, so maybe @dibyendumajumdar can tell me what to do next?

Thanks

What we need:

  • Syntax specification - including a way to specify the types of multiple return values

Maybe for a single return value:

function foo() : type 
end

For multiple values:

function foo() : (type1, type2, type3)
end
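
As a rough illustration of the two forms above, here is a minimal standalone sketch of recognizing either `: type` or `: (type1, type2, ...)` in the text that follows the closing `)` of a parameter list. It works on a plain string rather than on the real lexer, and `RetSig`, `read_name`, and `parse_ret_annotation` are hypothetical names that do not exist in Ravi. The sketch assumes an empty list `: ()` is a syntax error, which the proposal does not specify.

```c
#include <ctype.h>
#include <string.h>

#define MAX_RET_TYPES 8
#define MAX_TYPE_NAME 32

/* Hypothetical container for a parsed return signature. */
typedef struct {
  int count;                                 /* number of annotated return types */
  char names[MAX_RET_TYPES][MAX_TYPE_NAME];  /* type names in declaration order */
} RetSig;

static const char *skip_ws(const char *p) {
  while (*p == ' ' || *p == '\t') p++;
  return p;
}

/* Reads one identifier into sig; returns the position after it, or NULL. */
static const char *read_name(const char *p, RetSig *sig) {
  size_t n = 0;
  if (sig->count >= MAX_RET_TYPES) return NULL;
  if (!(isalpha((unsigned char)*p) || *p == '_')) return NULL;
  while (isalnum((unsigned char)p[n]) || p[n] == '_') n++;
  if (n >= MAX_TYPE_NAME) return NULL;
  memcpy(sig->names[sig->count], p, n);
  sig->names[sig->count][n] = '\0';
  sig->count++;
  return p + n;
}

/* Parses the text that follows ')' in a function header.
 * Returns 1 on success (including "no annotation"), 0 on a syntax error. */
int parse_ret_annotation(const char *p, RetSig *sig) {
  sig->count = 0;
  p = skip_ws(p);
  if (*p != ':') return 1;      /* no annotation: leave the signature empty */
  p = skip_ws(p + 1);
  if (*p != '(') {              /* single form:  ': type' */
    p = read_name(p, sig);
    return p != NULL;
  }
  p = skip_ws(p + 1);           /* list form:  ': (type1, type2, ...)' */
  for (;;) {
    p = read_name(p, sig);
    if (p == NULL) return 0;
    p = skip_ws(p);
    if (*p == ',') { p = skip_ws(p + 1); continue; }
    return *p == ')';
  }
}
```

In the real parser the same decision would presumably be made on tokens, for example by testing for `'('` after the `':'` has been consumed, rather than on characters.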

The parser uses the FuncState structure to hold details of the current function being compiled. We should probably add an array of return types there, including the type name for userdata types. As we parse, we need to store the types in the FuncState structure.
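
A minimal sketch of what such an array could look like, kept separate from the real FuncState so that it compiles on its own. `FuncSig`, `RetType`, `RetKind`, and `funcsig_add` are hypothetical names, the tags are not Ravi's actual type codes, and treating every unrecognized name as a userdata type name is an assumption of this sketch.

```c
#include <string.h>

#define MAX_RET_TYPES 8

/* Hypothetical type tags; Ravi's real type codes are defined elsewhere. */
typedef enum { RET_INTEGER, RET_NUMBER, RET_STRING, RET_USERDATA } RetKind;

typedef struct {
  RetKind kind;
  const char *usertype;  /* type name, only meaningful for RET_USERDATA */
} RetType;

/* Stand-in for the fields that would be added to FuncState. */
typedef struct {
  int nret;                    /* number of annotated return types, 0 if none */
  RetType ret[MAX_RET_TYPES];  /* declared types, in declaration order */
} FuncSig;

/* Maps an annotation name to a tag; unknown names count as userdata types. */
static RetKind kind_from_name(const char *name) {
  if (strcmp(name, "integer") == 0) return RET_INTEGER;
  if (strcmp(name, "number") == 0) return RET_NUMBER;
  if (strcmp(name, "string") == 0) return RET_STRING;
  return RET_USERDATA;
}

/* Appends one return type; returns 0 when the array is full.
 * The name pointer is borrowed, not copied; a real parser would
 * more likely keep the interned TString instead. */
int funcsig_add(FuncSig *sig, const char *name) {
  RetType *rt;
  if (sig->nret >= MAX_RET_TYPES) return 0;
  rt = &sig->ret[sig->nret++];
  rt->kind = kind_from_name(name);
  rt->usertype = (rt->kind == RET_USERDATA) ? name : NULL;
  return 1;
}
```

The parser would call something like `funcsig_add` once per name while reading the annotation, so the signature is complete before the function body is compiled.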

  • The type checks need to be added to the return op code.

This can be done in luaK_ret, where we generate OP_RETURN. We need to go through all of the values, check that their types match the function signature, and generate TO* opcodes against each register when we cannot check at compile time that the type is correct.

Note that some return values may be constants, so we cannot do any conversions in that case, or we need to move the constant to a temp register before generating TOINT/TOFLT opcodes.

function foo(): (integer, integer)
  return 4,2
end
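
The per-value decision described above could be planned roughly as follows. This is only a sketch under stated assumptions: `Ty`, `Action`, `RetValue`, `check_return_value`, and `plan_return_checks` are hypothetical names, the actions merely stand in for the TOINT/TOFLT opcodes, and the code generator itself is not modelled. The sketch assumes a constant always has a type known at compile time, so it never needs a runtime check, and it treats any statically known mismatch as a compile error, whereas a real implementation might prefer to coerce. It also assumes the number of returned values already matches the signature.

```c
#include <assert.h>

/* Hypothetical type tags and actions; the names are not taken from Ravi. */
typedef enum { TY_UNKNOWN, TY_INTEGER, TY_NUMBER } Ty;
typedef enum { ACT_NONE, ACT_EMIT_TOINT, ACT_EMIT_TOFLT, ACT_COMPILE_ERROR } Action;

typedef struct {
  Ty static_type;   /* what the compiler could infer; TY_UNKNOWN if nothing */
  int is_constant;  /* value lives in the constant table, not in a register */
} RetValue;

/* Decides what to do for one returned value against its declared type. */
Action check_return_value(Ty declared, RetValue v) {
  /* Assumption: the type of a constant is always known at compile time. */
  assert(!(v.is_constant && v.static_type == TY_UNKNOWN));
  if (declared == TY_UNKNOWN) return ACT_NONE;        /* slot is not annotated */
  if (v.static_type == declared) return ACT_NONE;     /* proven at compile time */
  if (v.static_type != TY_UNKNOWN) return ACT_COMPILE_ERROR;  /* known mismatch */
  /* Type unknown until runtime: the value is in a register, so check it there. */
  return declared == TY_INTEGER ? ACT_EMIT_TOINT : ACT_EMIT_TOFLT;
}

/* Fills actions[0..n-1]; returns the number of runtime checks, or -1 on error. */
int plan_return_checks(const Ty *declared, const RetValue *vals, int n,
                       Action *actions) {
  int i, runtime = 0;
  for (i = 0; i < n; i++) {
    actions[i] = check_return_value(declared[i], vals[i]);
    if (actions[i] == ACT_COMPILE_ERROR) return -1;
    if (actions[i] != ACT_NONE) runtime++;
  }
  return runtime;
}
```

For the `return 4,2` example both values are integer constants that match `(integer, integer)`, so this plan contains no runtime checks at all.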

The proposal above validates the return values in the return statement.

There is no change at the call site. This is a harder problem because at the call site we only know at runtime which function we will be calling. We will probably need to store the type signature in the Proto structure if any runtime call site optimization/checks need to be done. For now, this should be out of scope.