internal error in __sub: no metatable
limadm opened this issue · 0 comments
Hello!
I was trying to run pix2pix with `torch-cl`, but I found a subtle bug when using `nngraph`.
`nngraph` helps to build a complex neural network graph, overriding the `nn.Module.__unm` metamethod to convert `nn` operations to graph nodes, and the `nn.Module.__sub`/`graph.Node.__sub` metamethods as syntactic sugar to link the graph nodes. For example, we can define this dummy two-layer graph:
```lua
input1, input2 = -nn.Identity(), -nn.Identity() -- creates identity input nodes
output = {input1, input2} - nn.JoinTable(2)     -- joins the two inputs in a single tensor
```
So we end up with this:
```
>-- input1 --\
              >-- output -->
>-- input2 --/
```
It works fine with `torch`, but `torch-cl` gives:
```
~/torch-cl/install/bin/luajit: ./models.lua:69: internal error in __sub: no metatable
stack traceback:
	[C]: in function '__sub'
	./models.lua:69: in function 'defineG_unet'
	train.lua:110: in function 'defineG'
	train.lua:146: in main chunk
	[C]: in function 'dofile'
	...i/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x010e607ce0
```
I found this error message in `pkg/torch/lib/luaT/luaT.c`, where the metamethods are defined with `MT_DECLARE_OPERATOR`.
`torch` uses two macros, one for unary metamethods (`MT_DECLARE_OPERATOR`) and the other for binary metamethods (`MT_DECLARE_BIN_OPERATOR`), so when checking a binary metamethod it looks for the metatables of both operands.
In the above example, even though `{input1, input2}` has no metatable, `MT_DECLARE_BIN_OPERATOR` finds one in the second operand, `nn.JoinTable(2)`, and makes the call.
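For illustration only, here is a plain-Lua sketch of that fallback lookup (this is not the actual C code in `luaT.c`; `find_handler`, `node` and `inputs` are made-up names for this sketch):

```lua
-- Conceptual sketch of a binary-style lookup: try the first operand's
-- metatable, then fall back to the second operand's.
local function find_handler(a, b, event)
  local mt = getmetatable(a)
  if mt == nil or mt[event] == nil then
    mt = getmetatable(b)                     -- fall back to the second operand
  end
  if mt == nil or mt[event] == nil then
    error("internal error in " .. event .. ": no metatable")
  end
  return mt[event]
end

local node = setmetatable({}, { __sub = function(x, y) return "linked" end })
local inputs = { "input1", "input2" }        -- bare table, no metatable

print(find_handler(inputs, node, "__sub")(inputs, node))  --> linked
```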
`torch-cl`'s `luaT.c` uses just the unary version, so `{x,x} - nn.Module()` will only look for the metatable of the first operand (the bare table `{x,x}`) and fail with `internal error in __sub: no metatable`.
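Note that the C handler is reached at all in this case only because Lua's own operator dispatch already checks both operands: since the left operand has no metatable, the `__sub` registered on the right operand is invoked and receives the bare table as its first argument, so a first-operand-only lookup inside that handler is guaranteed to fail here. A stock-Lua demonstration of that dispatch (no torch involved, `Node` is a made-up name):

```lua
-- `a - b` is dispatched to the right operand's __sub when the left
-- operand is a bare table without a metatable.
local Node = {}
Node.__sub = function(a, b)
  -- `a` is the bare table here: it has no metatable to inspect,
  -- which is exactly the situation the first-operand-only lookup cannot handle.
  print("first operand metatable:", getmetatable(a))   --> nil
  return "handled by the right operand"
end

local node = setmetatable({}, Node)
print({ "input1", "input2" } - node)   --> handled by the right operand
```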
I think this could be solved by copying the `torch` approach and using separate macros for unary and binary metamethods. I can make this change if needed.
Thanks!