nengo/nengo-1.4

SimpleNode __init__ methods shouldn't require an odd order

tcstewar opened this issue · 7 comments

Right now, I can't do this:

import nef

class BadNode(nef.SimpleNode):
    def __init__(self, name):
        nef.SimpleNode.__init__(self, name)
        self.data=[0]
    def origin_test(self):
        return self.data

net = nef.Network('Simple Node Test')
net.add(BadNode('bad'))

as this gives the following confusing error:

Traceback (most recent call last):
  File "simplenodetest.py", line 11, in <module>
    net.add(BadNode('bad'))
  File "simplenodetest.py", line 5, in __init__
    nef.SimpleNode.__init__(self, name)
  File "python\nef\simplenode.py", line 153, in __init__
    self.create_origin(name[7:],method)
  File "python\nef\simplenode.py", line 180, in create_origin
    self.addOrigin(SimpleOrigin(name,self,func))
  File "python\nef\simplenode.py", line 60, in __init__
    BasicOrigin.__init__(self,node,name,len(func()),Units.UNK)
  File "simplenodetest.py", line 8, in origin_test
    return self.data
AttributeError: 'BadNode' object has no attribute 'data'

Instead, I have to do

import nef

class GoodNode(nef.SimpleNode):
    def __init__(self, name):
        self.data=[0]
        nef.SimpleNode.__init__(self, name)
    def origin_test(self):
        return self.data

net = nef.Network('Simple Node Test')
net.add(GoodNode('good'))

This is because the nef.SimpleNode.init executes all the origins once in order to find out what dimensionality their returned values are.

If this could be moved to a separate init() method, it'd get rid of this confusing error.

Hmm, on a second look I don't see a nice way to implement this....

Is there a way using __new__? It's guaranteed to run before the constructor. Or perhaps the simulator needs to call some kind of initialize method on Node's that is separate from their constructor? They already call the reset method, maybe setting up the origins and terminations can be done in SimpleNode's reset method.

Hmm, I think I want something that's guaranteed to run after the constructor....

But, I just realized that in all cases of this initialization pattern, you also want it to be reinitialized on reset. And that's what the init() thing is for in the SimpleNode, so what I should have done is this, which works fine:

import nef

class CorrectNode(nef.SimpleNode):
    def init(self):
        self.data=[0]
    def origin_test(self):
        return self.data

net = nef.Network('Simple Node Test')
net.add(CorrectNode('correct'))

This looks much more right to me (and is simpler, too). The only thing it requires is that you do your variable initialization in init() rather than __init__().

Is this good enough?

I'm not sure; it seems very unintuitive / un-Pythonic to ask people to not write an __init__ function, and you'd have to let init take arbitrary arguments. It seems like, if you need to run something after __init__, it should be reorganized. What about explicitly denoting the functions that are origins and terminations? You could very easily have a decorator to do this. How about:

import nef
from nef.decorators import origin

class CorrectNode(nef.SimpleNode):
    def __init__(self):
        self.data = [0]

    @origin
    def test(self):
        return self.data

You can even have the decorator accept the dimensionality as an argument. This seems a lot cleaner and less magical.

I could definitely get onboard with switching it over to decorators, but I'm not sure that changes anything about this problem. I could certainly see doing something like

    @nef.simplenode.origin(dimensions=1)
    def output(self):
        return self.data

As that's arguably fair bit clearer than this alternative, with its abuse of the default parameter setting thing (which is what we use for terminations right now, but not for origins)

    def origin_output(self, dimensions=1):
        return self.data

(I like decorators, although I haven't really got into the habit of using them, as the old version of Jython didn't support them)

The core question is whether people should be explicitly declaring the dimensionality of origins, as opposed to having the system figure out for itself what the dimensionality is based on the length of the returned array.

My feeling right now is that it's more consistent to manually specify the dimensionality (using either of the above approaches), and that would also get rid of the weird order problem that caused me to post this issue.

Still, I think part of the problem is that we are mixing together two different thing in the __init__ call. One is initialization, and one is resetting. And I'd like for there to be some encouragement to separate the two. For example, I recently wrote this code for a delay node:

class Delay(nef.SimpleNode):
    def __init__(self, name, steps):
        self.data = [0]*steps
        nef.SimpleNode.__init__(self, name)

    def termination_input(self, x):
        self.data = self.data[1:] + [x[0]]

    def origin_output(self):
        return [self.data[0]]    

This code has an unexpected bug in it: what happens on a reset? It's not aware of the reset, so it continues merely along spitting out data from before the reset. Here's what I should have written:

class Delay(nef.SimpleNode):
    def __init__(self, name, steps):
        self.steps=steps
        nef.SimpleNode.__init__(self, name)

    def init(self):
        self.data = [0]*self.steps

    def termination_input(self, x):
        self.data = self.data[1:] + [x[0]]

    def origin_output(self):
        return [self.data[0]]    

Now it actually calls init() when you click reset in interactive mode. Of course, this still requires the bizarre ordering thing.

One thing that comes to mind is to have nef.Network() call something when the node is put into the network (in net.add()) that triggers the origin/termination creation (and calls init()). That'd allow this to work:

class Delay(nef.SimpleNode):
    def __init__(self, name, steps):
        nef.SimpleNode.__init__(self, name)
        self.steps=steps
    ...

But, it'd add in this weird thing where a SimpleNode doesn't actually have origins and terminations untill it's been added into a network. I think I could live with that, but is it worth it to allow this more intuitive ordering? I'm thinking yes at this point....

In the past, I've handled resetting with a reset() function. Then I just had my __init__ call self.reset(), which may not be the clearest, but it didn't repeat anything and made it reset properly. It seemed the straightforward solution, since all nodes have reset called on them, and I didn't know about init at the time.

And as to the intuitive ordering... I'm not sure! I think it's true that I more often than not call the superclass constructor first, but I don't know if that's a hard and fast rule, and I've definitely called it at other points. The ideal case would that it doesn't matter when it's called, but yeah, not always possible.

I try to avoid reset() just because it's a bit Nengo-specific (with its weird randomize parameter), and I have to remember to create a constructor to make sure it gets called at the beginning. I was noticing that I was often writing a constructor that just called reset() and then called the parent constructor. init was meant as a generic thing that does exactly what you described. But I forgot about it completely (and I not only wrote it but also covered it in the documentation), so it's clearly not all that memorable....