oxford-cs-ml-2015/practical6

Understanding model_utils.lua

vivanov879 opened this issue · 9 comments

Is it possible to switch combine_all_parameters() and clone_many_times() in train.lua? Can I extract the parameters after making a bunch of clones? If I extract the params like this:

local params, grad_params = model_utils.combine_all_parameters(unpack(clones.embed), unpack(clones.lstm), unpack(clones.softmax))

the model fails to reduce the loss over iterations.

Turns out I passed the arguments to combine_all_parameters incorrectly: in Lua, only the last unpack() in an argument list expands to all of its values, so each of the earlier unpack calls contributed just its first element. If I concatenate the tables first:

-- appends the elements of t2 to t1 and returns t1 (modifies t1 in place)
function TableConcat(t1,t2)
    for i=1,#t2 do
        t1[#t1+1] = t2[i]
    end
    return t1
end

local params, grad_params = model_utils.combine_all_parameters(unpack(TableConcat(TableConcat(clones.embed, clones.lstm), clones.softmax)))

and now it works
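
A toy illustration of what was going wrong (hypothetical tables a and b, not from train.lua):

-- only the last unpack() in an argument list expands to all of its values
local a = {1, 2, 3}
local b = {4, 5, 6}
print(unpack(a), unpack(b))        -- prints 1 4 5 6: the values 2 and 3 from a are dropped
print(unpack(TableConcat(a, b)))   -- prints 1 2 3 4 5 6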

That works because combine_all_parameters looks for shared Storage objects, but for cleanliness I'd still suggest sharing the parameters once first, before making the clones.
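
For reference, a minimal sketch of that order, using the embed/lstm/softmax naming from this thread (it assumes the un-cloned prototype modules have already been built, and that opt.seq_length is the unroll length as in train.lua):

-- share/flatten the parameters once, on the un-cloned prototypes
local params, grad_params = model_utils.combine_all_parameters(embed, lstm, softmax)

-- only then unroll into clones; every clone's weights point at the flattened storage
local clones = {}
clones.embed   = model_utils.clone_many_times(embed, opt.seq_length)
clones.lstm    = model_utils.clone_many_times(lstm, opt.seq_length)
clones.softmax = model_utils.clone_many_times(softmax, opt.seq_length)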

Also, note that TableConcat modifies t1 in place (tables are passed by reference), i.e. clones.embed will end up with clones.lstm and clones.softmax appended to it.

Instead, you might want to do

function TableConcat(t1,t2)
    local t3 = {}
    for i=1,#t1 do
        t3[#t3+1] = t1[i]
    end
    for i=1,#t2 do
        t3[#t3+1] = t2[i]
    end
    return t3
end

Brendan, thanks for the explanation. I am using clone_many_times.lua for recursive networks. Basically, I have 100 trees whose node counts vary from 7 to 130, and in every node there is the same neural net classifying the node as positive or negative sentiment (both setups are sketched in code after this list). So if I:

  1. Create 1 neural network m, create 130x100 clones: m_clones = clone_many_times(m, 130*100), and put a clone into each node, so that no clone is ever reused, and do forward and backward one tree after another, I have no problems; the model learns fine.
  2. But if I create only 130 clones and put a clone into each node, so that clones are unique only within a single tree, then do forward and backward, then adagrad, the model doesn't learn well.
    Do you think the problem is in my implementation of forward propagation through a tree, or will it not work anyway?
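
Concretely, the two setups look like this (m is the single per-node classifier, counts as above):

-- setup 1: a distinct clone for every node of every tree (never reused)
local clones_all = model_utils.clone_many_times(m, 130 * 100)

-- setup 2: one set of 130 clones, reused from tree to tree
local clones_shared = model_utils.clone_many_times(m, 130)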

Even if I do the following, the model still doesn't learn well:

while i < num_iterations:
  1. forward propagate through 1 tree
  2. backward propagate through 1 tree
  3. adagrad updates my parameters
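
In Torch terms the loop is roughly this (a sketch only; forward_tree and backward_tree are placeholder names for my own tree routines, and clones_shared is the reused set of 130 clones):

local optim = require 'optim'
local adagrad_state = {learningRate = 1e-2}

for i = 1, num_iterations do
    local feval = function(x)
        if x ~= params then params:copy(x) end
        grad_params:zero()                              -- clear gradients from the previous tree
        local tree = trees[(i - 1) % #trees + 1]
        local loss = forward_tree(clones_shared, tree)  -- 1. forward propagate through 1 tree
        backward_tree(clones_shared, tree)              -- 2. backward propagate through the same tree
        return loss, grad_params
    end
    optim.adagrad(feval, params, adagrad_state)         -- 3. adagrad updates the shared parameters
end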

So, maybe there is some way to reset a clone?

Sounds like you may have a bug... I suggest constructing a random tree and
checking your gradients.
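
For example, something along these lines (a sketch; feval is assumed to run forward+backward on one fixed random tree each time it is called, fill grad_params, and return the loss):

-- finite-difference check of a few randomly chosen parameters
local function check_grad(feval, params, grad_params, eps)
    eps = eps or 1e-4
    feval(params)                         -- fills grad_params analytically
    local analytic = grad_params:clone()
    for _ = 1, 10 do
        local i = torch.random(params:nElement())
        local orig = params[i]
        params[i] = orig + eps
        local loss_plus = feval(params)
        params[i] = orig - eps
        local loss_minus = feval(params)
        params[i] = orig
        local numeric = (loss_plus - loss_minus) / (2 * eps)
        print(string.format('param %d: analytic %.6e  numeric %.6e', i, analytic[i], numeric))
    end
end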

thanks -- will do now

Fixed the bug -- the problem was that I didn't forwardProp through each tree before evaluating the confusion matrix.
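
For the record, the evaluation now looks roughly like this (hypothetical names for my own data structures; ConfusionMatrix comes from optim):

local confusion = optim.ConfusionMatrix({'negative', 'positive'})
confusion:zero()
for _, tree in ipairs(test_trees) do
    forward_tree(clones_shared, tree)          -- the forwardProp that was missing
    for _, node in ipairs(tree.nodes) do
        confusion:add(node.output, node.label) -- node.output: class scores, node.label: 1 or 2
    end
end
print(confusion)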