glycerine/zygomys

Anonymous function inside of a package loses its scope

bmatsuo opened this issue · 16 comments

My team has been experimenting with the use of packages to organize some code. But there appears to be an issue where anonymous functions cannot see functions defined within the same package.

The following one-liner demonstrates the problem.

(letseq [mypkg (package mypkg (begin
            (defn Double [x]
                (+ x x))

            (defn DoubleAll [xs]
                (begin
                    (println (Double 10))
                    (map (fn [x] (Double x)) xs)))))]
        (println (mypkg.DoubleAll ^(1 2 3))))

Pasting the above into the repl produces the following output:

20
error in __anon256:3: Error calling 'infix': Error calling 'map': symbol `Double` not found
in map:-1
in DoubleAll:9
in evalGeneratedFunction:20
in infix:0
in __main:10

The symbol Double is resolved in the expression (println (Double 10)) but it cannot be resolved in the anonymous function (fn [x] (Double x)) which appears on the following line.

Is this somehow desired behavior?

The workaround we have found is to qualify the symbol using the package name inside the anonymous function (i.e. (fn [x] (mypkg.Double x))). This seems to work but it is not ideal because it seems we are restricted to calling public/exported package functions within anonymous functions.

Hi @bmatsuo, thanks for mentioning this.

> Is this somehow desired behavior?

Heh. You're so polite. I appreciate that.

No, I think it's just a bug.

I'm not sure offhand why that would be happening.

The main lookup code is in

func (env *Zlisp) LexicalLookupSymbol(sym *SexpSymbol, setVal *Sexp) (Sexp, error, *Scope) {

and the rules are discussed just inside,

// (1) first go up the linearstack (runtime stack) until

This was quite delicate code, and packages were a much more recent addition. So I can imagine there is, as you've noted, interaction between them that's not anticipated.

If you would be so kind as to work up a pull request, I would be happy to review it. I'm fairly busy and probably won't get to this in the near term.

You've already got a test, which is the first part. Just put it into a .zy file in the tests directory, and change it so it is failing now.

I'll take a look over things and see what I can find. Though in general I don't know much about lisp. I didn't ask that stupid question to be flippant. I really don't know how things are supposed to work half of the time. :)

:-) no worries. It's just supposed to be straight lexical scoping. Or, "what you see on the page is what you get".

Well, I'm still rather confused after looking through the code. I developed enough intuition about how things are broken to arrive at this problem invoking anonymous functions:

((let [x 123] (fn [] ((fn [] x)))))

The let returns a function which calls an anonymous function referencing x, and zygo is not able to resolve that symbol. Racket was able to handle the equivalent(?) expression, so I think this example is legit.

((let ([x 123]) (lambda [] ((lambda [] x)))))
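For comparison with another lexically scoped language, here is a minimal Go sketch of the same shape. The name makeNested is hypothetical and exists only for illustration; nothing here is taken from the zygomys source.

```go
package main

import "fmt"

// makeNested mirrors ((let [x 123] (fn [] ((fn [] x))))):
// x is bound in an enclosing block, the outer function literal is returned,
// and the inner literal is created and called only when the outer one runs.
func makeNested() func() int {
	x := 123
	return func() int {
		return func() int { return x }()
	}
}

func main() {
	fmt.Println(makeNested()()) // prints 123
}
```

Under lexical scoping the inner literal still sees x, even though it is created after the block that bound x has returned.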

Where I am most confused is that it seems like there is supposed to be a recursive quality to the lexical lookup, and I can't see that in Zlisp.LexicalLookupSymbol. AFAICT there is a disconnect between the code comments and the code. Obviously that divergence is natural. But I am confused by these comments.

// (1) first go up the linearstack (runtime stack) until
// we get to the first (user-defined) function boundary; this gives
// us actual arg bindings and any lets/newScopes
// present at closure definition time.
// (2) check the env.curfunc.closedOverScopes; it has a full
// copy of the runtime linearstack at definition time.
// If not found there, (say because we're in a '+' function),
// go up the linearstack until we hit a user defined function boundary,
// which will have captured in its closure a set of relevant
// bindings

The section I have put in bold (the second half of (2), from "If not found there" onward) is not implemented, from what I have grokked out of the code. There is code to handle the first part of (2). But where is the fallback when the symbol isn't found in the dynamic scope lookup that is env.curfunc.ClosingLookupSymbol?

Am I totally off here?
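To make the question concrete, here is a rough Go sketch of one possible reading of rules (1) and (2) in that comment. Scope, Closure, lookup and demo are hypothetical stand-ins invented for this sketch; they are not the actual zygomys types, and the real LexicalLookupSymbol may behave differently.

```go
package main

import "fmt"

// Closure is a hypothetical stand-in for a user-defined function together
// with the scopes it captured when it was defined.
type Closure struct {
	Name       string
	ClosedOver []*Scope
}

// Scope is a hypothetical stand-in for one entry on the runtime scope stack.
// Fn is non-nil only when the scope is the call frame of a user-defined
// function; frames of builtins such as '+' leave it nil.
type Scope struct {
	Name string
	Fn   *Closure
	Vars map[string]int
}

// lookup resolves sym against a runtime stack whose last element is the
// innermost scope.
func lookup(sym string, stack []*Scope) (int, bool) {
	for i := len(stack) - 1; i >= 0; i-- {
		s := stack[i]
		// (1) Bindings on the runtime stack, up to and including the
		// first user-defined function boundary.
		if v, ok := s.Vars[sym]; ok {
			return v, true
		}
		if s.Fn == nil {
			continue // let scope or builtin frame: keep walking up
		}
		// (2) At the boundary, search the scopes captured at definition
		// time, innermost first, and stop there.
		for j := len(s.Fn.ClosedOver) - 1; j >= 0; j-- {
			if v, ok := s.Fn.ClosedOver[j].Vars[sym]; ok {
				return v, true
			}
		}
		return 0, false
	}
	return 0, false
}

// demo builds a stack in which a builtin frame sits on top of a user-defined
// function whose closure captured x.
func demo() (int, bool) {
	global := &Scope{Name: "global", Vars: map[string]int{}}
	let := &Scope{Name: "runtime let", Vars: map[string]int{"x": 123}}
	fn := &Closure{Name: "anon", ClosedOver: []*Scope{global, let}}
	stack := []*Scope{
		global,
		{Name: "anon frame", Fn: fn, Vars: map[string]int{}},
		{Name: "+", Vars: map[string]int{}},
	}
	return lookup("x", stack)
}

func main() {
	fmt.Println(demo()) // prints 123 true
}
```

In this reading, the "we're in a '+' function" case is handled by continuing up the stack past builtin frames until a user-defined boundary supplies captured scopes.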

Not solved yet. Notes (debugging aloud here)...

https://github.com/glycerine/zygomys/blob/master/zygo/scopes.go#L142

is where the search (is supposed to) go up the stack

Part of what makes things tricky is that it generates byte code that creates scopes when run. This dynamic is necessitated by the runtime-created nature of anonymous functions.

434dad3

Use .debug at the command prompt to turn on debugging. You can place (_ls) anywhere to see the scope stack at that point.

Working with just this small example, x is lost:

> (def res (let [x 123] (fn [] ((fn [] x)))))
> (res)
error in __anon252:1: Error calling 'infix': symbol `x` not found
in __anon251:3
in evalGeneratedFunction:1
in infix:0
in __main:13

Stack capture happens in closing.go (correction: not closure.go).

I can see that the innermost closure isn't capturing a stack that has x on it, oddly.

 NewClosing is cloning the linear stack: '  
     elem 0 of :  global
         (global scope - omitting content for brevity)
     elem 1 of :  scope Name: 'runtime let'
         x -> 123
 )'.

+++ CreateClosure: assign to '(fn [] ((fn [] x)))' the stack:

         closedOverScopes of '__anon268'
             elem 0 of closedOverScopes of '__anon268':  global
                 (global scope - omitting content for brevity)
             elem 1 of closedOverScopes of '__anon268':  scope Name: 'runtime let'
                 x -> 123


222 CreateClosure: top of NewClosing Scope has addr 0xc4201a8a80 and is

 NewClosing is cloning the linear stack: '  
     elem 0 of :  global
         (global scope - omitting content for brevity)
     elem 1 of :  __anon268 at pc=0
         empty-scope: no symbols
 )'.

+++ CreateClosure: assign to '(fn [] x)' the stack:

         closedOverScopes of '__anon269'
             elem 0 of closedOverScopes of '__anon269':  global
                 (global scope - omitting content for brevity)
             elem 1 of closedOverScopes of '__anon269':  __anon268 at pc=0
                 empty-scope: no symbols


222 CreateClosure: top of NewClosing Scope has addr 0xc4201a8ae0 and is
error in __anon269:1: symbol `x` not found
in __anon268:3
in __main:16
tests/closure3.zy failed

I pushed the test file that I'm using, for reference https://github.com/glycerine/zygomys/blob/master/tests/closure3.zy

(print (_closdump res)) is useful to dump a closure res's stack

We'll fix this.

But in the meantime, I feel compelled to mention, wouldn't you rather script your project in Go rather than lisp? :-) The static type checking is such a huge boon...

https://github.com/gijit/gi

I think our use cases are different. I don't think gi fits my goals. Anyway..

I see more now where recursion is supposed to occur. Thank you.

Maybe this is a cleaner version of the short example ⛳

(((let [x 123] (fn [] (fn [] x)))))

All function evaluation happens in the outermost expressions.

zygo> f1 = (let [x 123] (fn [] (fn [] x)))
(fn [] (fn [] x))
zygo> f2 = (f1)
(fn [] x)
zygo> (f2)
error in __anon256:1: Error calling 'infix': symbol `x` not found
in evalGeneratedFunction:1
in infix:0
in __main:9
zygo> (print (_closdump f1))
 closedOverScopes of '__anon255'
     elem 0 of closedOverScopes of '__anon255':  global
         (global scope - omitting content for brevity)
     elem 1 of closedOverScopes of '__anon255':  scope Name: 'runtime let'
         x -> 123
zygo> (print (_closdump f2))
 closedOverScopes of '__anon256'
     elem 0 of closedOverScopes of '__anon256':  global
         (global scope - omitting content for brevity)
     elem 1 of closedOverScopes of '__anon256':  __anon255 at pc=0
         empty-scope: no symbols

Just to be clear, it seems like the second dump is supposed to show three elements and look like the following:

 closedOverScopes of '__anon256'
     elem 0 of closedOverScopes of '__anon256':  global
         (global scope - omitting content for brevity)
     elem 1 of closedOverScopes of '__anon256':  scope Name: 'runtime let'
         x -> 123
     elem 2 of closedOverScopes of '__anon256':  __anon255 at pc=0
         empty-scope: no symbols

Basically, the second function's closedOverScopes are inherited from the creating function, in addition to having one additional scope containing the parameters for the creating function (in this case an empty scope). Is that right?

> Basically, the second function's closedOverScopes are inherited from the creating function, in addition to having one additional scope containing the parameters for the creating function (in this case an empty scope). Is that right?

That is what we desire, yes, but not what we've got. closedOverScopes are (at least, currently) a snapshot of the runtime scope environment at the point at which the function is defined (converted to bytecode).

I'm thinking there needs to be a more explicit tracking of the lexical environment -- to handle these nested function definitions correctly.
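The difference between the two behaviors can be sketched with a toy model in Go. All names here (Scope, Closure, captureSnapshot, captureInherited, Resolve, build) are hypothetical and chosen for illustration; this is not the zygomys implementation, nor a claim about how the fix should be written.

```go
package main

import "fmt"

// Scope is a toy stand-in for one scope on the stack.
type Scope struct {
	Name string
	Vars map[string]int
}

// Closure is a toy stand-in for a function value plus its captured scopes.
type Closure struct {
	Name       string
	ClosedOver []*Scope
}

// captureSnapshot models the behavior reported in this thread: the new
// closure stores a copy of the runtime stack and nothing else.
func captureSnapshot(name string, runtimeStack []*Scope) *Closure {
	return &Closure{Name: name, ClosedOver: append([]*Scope(nil), runtimeStack...)}
}

// captureInherited models the desired behavior: the new closure inherits the
// creating function's captured scopes and adds the creator's own call frame.
func captureInherited(name string, creator *Closure, frame *Scope) *Closure {
	scopes := append([]*Scope(nil), creator.ClosedOver...)
	return &Closure{Name: name, ClosedOver: append(scopes, frame)}
}

// Resolve searches the captured scopes, innermost first.
func (c *Closure) Resolve(sym string) (int, bool) {
	for i := len(c.ClosedOver) - 1; i >= 0; i-- {
		if v, ok := c.ClosedOver[i].Vars[sym]; ok {
			return v, true
		}
	}
	return 0, false
}

// build recreates the f1/f2 example under both capture strategies.
func build() (buggy, fixed *Closure) {
	global := &Scope{Name: "global"}
	let := &Scope{Name: "runtime let", Vars: map[string]int{"x": 123}}
	f1 := captureSnapshot("f1", []*Scope{global, let})
	frame := &Scope{Name: "f1 at pc=0"} // f1 takes no parameters
	buggy = captureSnapshot("f2", []*Scope{global, frame})
	fixed = captureInherited("f2", f1, frame)
	return buggy, fixed
}

func main() {
	buggy, fixed := build()
	_, ok := buggy.Resolve("x")
	fmt.Println(ok) // prints false
	v, ok := fixed.Resolve("x")
	fmt.Println(v, ok, len(fixed.ClosedOver)) // prints 123 true 3
}
```

The inherited variant ends up with the three-element layout proposed above (global, runtime let, creator frame), while the snapshot-only variant reproduces the two-element dump in which x is missing.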

> closedOverScopes are (at least, currently) a snapshot of the runtime scope environment at the point at which the function is defined

Are you saying closedOverScopes is just the scopes along the function call stack? That is, it is just the set of function parameters and let bindings in the runtime call stack along with global scope?

Are closed-over variables for functions below the top of the stack lost? Because I can't seem to get a handle on them.

> I'm thinking there needs to be a more explicit tracking of the lexical environment -- to handle these nested function definitions correctly.

Indeed.

> closedOverScopes are (at least, currently) a snapshot of the runtime scope environment at the point at which the function is defined

> Are you saying closedOverScopes is just the scopes along the function call stack? That is, it is just the set of function parameters and let bindings in the runtime call stack along with global scope?

That's close. The language "just" seems to imply something obvious. There's subtlety here. Perhaps it bears repeating. The closedOverScopes is a snapshot of the familiar runtime callstack taken at the point of closure definition, and stored with the closure. Since a closure can be defined multiple times in different environments, the context has to be saved each time.

> Are closed-over variables for functions below the top of the stack lost? Because I can't seem to get a handle on them.

The closedOverScopes snapshots should be saving references to them. But I'm not sure that is happening fully or correctly with multiple scopes on the stack.

Ideally it works like this:

Consider this nested pair, given names outer and inner for ease of discussion. Same scenario discussed above, only named instead of anonymous.

(defn outer [a]
     (defn inner [] a)
     inner
)

Each time outer is run, a new inner function must be defined, so inner, at definition time, needs to capture the runtime binding to a. Runtime for outer is definition time for inner. Definition time is when a snapshot is saved.

Since outer has (or should have) a scope on the runtime stack at inner definition time, the snapshot of the runtime stack taken at inner definition time should be sufficient to locate outer's scope, and therefore a, from within inner.
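As an analogy only, the same outer/inner pair written in Go shows the behavior being aimed for: every run of outer defines a fresh inner bound to that run's a. The Go function below is an illustrative translation, not code from zygomys.

```go
package main

import "fmt"

// outer mirrors (defn outer [a] (defn inner [] a) inner): each call defines
// a new inner that captures the a of that particular call.
func outer(a int) func() int {
	inner := func() int { return a }
	return inner
}

func main() {
	first := outer(1)
	second := outer(2)
	// Each inner keeps the binding that was live when it was defined.
	fmt.Println(first(), second()) // prints 1 2
}
```

Here runtime for outer is definition time for inner, which is why two calls to outer yield two closures with independent bindings.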

This is definitely one of the hairiest, most subtle parts. It took a long time to get the nested closure test in tests/closure.zy to work. Things are much simpler in C, where definition time and runtime are almost always distinct and not overlapping.

As I said, I don't have time to work on this myself at present.

> I think our use cases are different. I don't think gi fits my goals. Anyway..

I prefer Go, but if you insist upon a Lisp/Scheme, Chez is a pretty nice one. Here's an embedding in Go I did a couple months back: https://github.com/go-interpreter/chezgo

Fixed in v5.0.8. The tests for this issue in tests/closure3.zy are now green.

@bmatsuo This should be good to go. Let me know if you see anything else.