oxford-cs-ml-2015/practical6

Understanding model_utils.lua

vivanov879 opened this issue · 9 comments

Is it possible to switch combine_all_parameters() and clone_many_times() in train.lua? Can I extract the parameters after making a bunch of clones? If I extract the params like this:

local params, grad_params = model_utils.combine_all_parameters(unpack(clones.embed), unpack(clones.lstm), unpack(clones.softmax))

the model fails to reduce the loss over iterations.

Turns out I passed the arguments to combine_all_parameters incorrectly: in Lua, only the last unpack() in an argument list expands to all of its values, so each of the earlier unpack calls contributed just its first element. If I concatenate the tables first:

-- appends the elements of t2 to t1 and returns t1 (modifies t1 in place)
function TableConcat(t1,t2)
    for i=1,#t2 do
        t1[#t1+1] = t2[i]
    end
    return t1
end

local params, grad_params = model_utils.combine_all_parameters(unpack(TableConcat(TableConcat(clones.embed, clones.lstm), clones.softmax)))

and now it works
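
A toy illustration of what was going wrong (hypothetical tables a and b, not from train.lua):

-- only the last unpack() in an argument list expands to all of its values
local a = {1, 2, 3}
local b = {4, 5, 6}
print(unpack(a), unpack(b))        -- prints 1 4 5 6: the values 2 and 3 from a are dropped
print(unpack(TableConcat(a, b)))   -- prints 1 2 3 4 5 6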

That works because combine_all_parameters looks for shared Storage objects, but for cleanliness I'd still suggest sharing the parameters once first, before making the clones.
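
For reference, a minimal sketch of that order, using the embed/lstm/softmax naming from this thread (it assumes the un-cloned prototype modules have already been built, and that opt.seq_length is the unroll length as in train.lua):

-- share/flatten the parameters once, on the un-cloned prototypes
local params, grad_params = model_utils.combine_all_parameters(embed, lstm, softmax)

-- only then unroll into clones; every clone's weights point at the flattened storage
local clones = {}
clones.embed   = model_utils.clone_many_times(embed, opt.seq_length)
clones.lstm    = model_utils.clone_many_times(lstm, opt.seq_length)
clones.softmax = model_utils.clone_many_times(softmax, opt.seq_length)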

Also, note that TableConcat modifies t1 in place (tables are passed by reference), i.e. clones.embed will end up with clones.lstm and clones.softmax appended to it.

Instead, you might want to do

function TableConcat(t1,t2)
    local t3 = {}
    for i=1,#t1 do
        t3[#t3+1] = t1[i]
    end
    for i=1,#t2 do
        t3[#t3+1] = t2[i]
    end
    return t3
end

Brendan, thanks for the explanation. I am using clone_many_times.lua for recursive networks. Basically, I have 100 trees whose node counts vary from 7 to 130, and in every node there is the same neural net classifying the node as positive or negative sentiment (both setups are sketched in code after this list). So if I:

  1. Create 1 neural network m, create 130x100 clones: m_clones = clone_many_times(m, 130*100), and put a clone into each node, so that no clone is ever reused, and do forward and backward one tree after another, I have no problems; the model learns fine.
  2. But if I create only 130 clones and put a clone into each node, so that clones are unique only within a single tree, then do forward and backward, then adagrad, the model doesn't learn well.
    Do you think the problem is in my implementation of forward propagation through a tree, or will it not work anyway?
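
Concretely, the two setups look like this (m is the single per-node classifier, counts as above):

-- setup 1: a distinct clone for every node of every tree (never reused)
local clones_all = model_utils.clone_many_times(m, 130 * 100)

-- setup 2: one set of 130 clones, reused from tree to tree
local clones_shared = model_utils.clone_many_times(m, 130)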

Even if I do the following, the model still doesn't learn well:

while i < num_iterations:
  1. forward propagate through 1 tree
  2. backward propagate through 1 tree
  3. adagrad updates my parameters
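
In Torch terms the loop is roughly this (a sketch only; forward_tree and backward_tree are placeholder names for my own tree routines, and clones_shared is the reused set of 130 clones):

local optim = require 'optim'
local adagrad_state = {learningRate = 1e-2}

for i = 1, num_iterations do
    local feval = function(x)
        if x ~= params then params:copy(x) end
        grad_params:zero()                              -- clear gradients from the previous tree
        local tree = trees[(i - 1) % #trees + 1]
        local loss = forward_tree(clones_shared, tree)  -- 1. forward propagate through 1 tree
        backward_tree(clones_shared, tree)              -- 2. backward propagate through the same tree
        return loss, grad_params
    end
    optim.adagrad(feval, params, adagrad_state)         -- 3. adagrad updates the shared parameters
end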

So, maybe there is some way to reset a clone?

Sounds like you may have a bug... I suggest constructing a random tree and
checking your gradients.
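
For example, something along these lines (a sketch; feval is assumed to run forward+backward on one fixed random tree each time it is called, fill grad_params, and return the loss):

-- finite-difference check of a few randomly chosen parameters
local function check_grad(feval, params, grad_params, eps)
    eps = eps or 1e-4
    feval(params)                         -- fills grad_params analytically
    local analytic = grad_params:clone()
    for _ = 1, 10 do
        local i = torch.random(params:nElement())
        local orig = params[i]
        params[i] = orig + eps
        local loss_plus = feval(params)
        params[i] = orig - eps
        local loss_minus = feval(params)
        params[i] = orig
        local numeric = (loss_plus - loss_minus) / (2 * eps)
        print(string.format('param %d: analytic %.6e  numeric %.6e', i, analytic[i], numeric))
    end
end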

thanks -- will do now

Fixed the bug -- the problem was that I didn't forwardProp through each tree before evaluating the confusion matrix.
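
For the record, the evaluation now looks roughly like this (hypothetical names for my own data structures; ConfusionMatrix comes from optim):

local confusion = optim.ConfusionMatrix({'negative', 'positive'})
confusion:zero()
for _, tree in ipairs(test_trees) do
    forward_tree(clones_shared, tree)          -- the forwardProp that was missing
    for _, node in ipairs(tree.nodes) do
        confusion:add(node.output, node.label) -- node.output: class scores, node.label: 1 or 2
    end
end
print(confusion)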