Understanding model_utils.lua
vivanov879 opened this issue · 9 comments
Is it possible to switch combine_all_parameters() and clone_many_times() in train.lua? Can I extract parameters after making a bunch of clones? If I extract the parameters like this:
local params, grad_params = model_utils.combine_all_parameters(unpack(clones.embed), unpack(clones.lstm), unpack(clones.softmax))
the model fails to reduce the loss over iterations.
Turns out I incorrectly passed arguments to combine_all_parameters:
-- appends the elements of t2 onto t1 (note: this mutates t1 in place)
function TableConcat(t1, t2)
  for i = 1, #t2 do
    t1[#t1 + 1] = t2[i]
  end
  return t1
end
local params, grad_params = model_utils.combine_all_parameters(unpack(TableConcat(TableConcat(clones.embed, clones.lstm), clones.softmax)))
and now it works.
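For reference, the likely culprit: in Lua, only the last multi-value expression (such as unpack()) in an argument list expands to all of its values; any earlier ones are truncated to a single value, so the original call silently passed only the first clone from clones.embed and clones.lstm. A minimal illustration (nothing here is specific to model_utils):
local function count(...) return select('#', ...) end
local a, b = {1, 2, 3}, {4, 5, 6}
print(count(unpack(a), unpack(b)))  -- prints 4: unpack(a) is truncated to a[1]
print(count(unpack(b)))             -- prints 3: unpack in final position expands fully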
That works because it looks for shared Storage objects, but I'd still suggest (for cleanliness) sharing parameters once first, before making clones.
Also, note that TableConcat will modify t1 in place due to passing references, i.e. clones.embed will contain clones.lstm and clones.softmax appended after it.
Instead, you might want to do:
-- returns a new table with t1's elements followed by t2's, leaving both inputs untouched
function TableConcat(t1, t2)
  local t3 = {}
  for i = 1, #t1 do
    t3[#t3 + 1] = t1[i]
  end
  for i = 1, #t2 do
    t3[#t3 + 1] = t2[i]
  end
  return t3
end
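For the "share once, then clone" ordering, a minimal sketch of the usual setup could look like this (protos and seq_length are illustrative names, not part of model_utils):
-- 'protos' holds the un-cloned prototype networks (illustrative name)
-- flatten parameters across the prototypes first
local params, grad_params = model_utils.combine_all_parameters(protos.embed, protos.lstm, protos.softmax)
-- then clone each prototype; clone_many_times makes the clones share the flattened storage
local clones = {}
for name, proto in pairs(protos) do
  clones[name] = model_utils.clone_many_times(proto, seq_length)
end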
Brendan, thanks for the explanation. I am using clone_many_times.lua for recursive networks. Basically, I have 100 trees with between 7 and 130 nodes each, and every node contains the same neural net classifying the node as either positive or negative sentiment. So:
- If I create 1 neural network m, create 130*100 clones with m_clones = clone_many_times(m, 130*100), and put a clone into each node so that clones do not repeat at all, then do forward and backward one tree after another, I have no problems and the model learns OK.
- But if I create only 130 clones and put a clone into each node, so that clones do not repeat only within a single tree, then do forward and backward followed by adagrad, the model doesn't learn well.
Do you think the problem is in my implementation of forward propagation through a tree, or will it not work anyway?
Even if I do the following:
while i < num_iterations:
1. forward propagate through 1 tree
2. backward propagate through 1 tree
3. adagrad updates my parameters
the model still doesn't learn well. So, maybe there is some way to reset a clone?
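For concreteness, here is roughly what that loop looks like with optim.adagrad (trees, num_iterations, forward_tree and backward_tree are placeholders for my own code; params/grad_params come from combine_all_parameters, and the shared grad_params is zeroed before each tree):
local optim = require 'optim'
local adagrad_state = {learningRate = 1e-2}

for iter = 1, num_iterations do
  local tree = trees[(iter - 1) % #trees + 1]
  local function feval(p)
    if p ~= params then params:copy(p) end
    grad_params:zero()               -- reset shared gradients before this tree
    local loss = forward_tree(tree)  -- placeholder: my tree forward pass
    backward_tree(tree)              -- placeholder: my tree backward pass
    return loss, grad_params
  end
  optim.adagrad(feval, params, adagrad_state)
end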
Sounds like you may have a bug... I suggest constructing a random tree and checking your gradients.
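In case it helps, a minimal finite-difference check against the flattened params/grad_params could look something like this (feval here stands for a closure that runs a full forward/backward over one random tree and returns the loss; all names are illustrative):
local function check_gradients(params, grad_params, feval, eps)
  eps = eps or 1e-4
  feval(params)                          -- fills grad_params analytically
  local analytic = grad_params:clone()
  for i = 1, math.min(params:nElement(), 10) do  -- spot-check a few entries
    local orig = params[i]
    params[i] = orig + eps
    local loss_plus = feval(params)
    params[i] = orig - eps
    local loss_minus = feval(params)
    params[i] = orig
    local numeric = (loss_plus - loss_minus) / (2 * eps)
    print(string.format("param %d: analytic %.6e  numeric %.6e", i, analytic[i], numeric))
  end
end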
thanks -- will do now
fixed the bug -- the problem was that I didn't forwardProp through each tree before evaluating the confusion matrix.
thanks -- now it works: https://github.com/vivanov879/recursive_neural_network