About the gating mechanism
Punchwes opened this issue · 5 comments
Hi @svjan5 ,
After reading your source code, I found a few places that confuse me. In the paper, the formula is given as:
And where:
while in your code, it is like this:
with tf.name_scope("in_arcs-%s_name-%s_layer-%d" % (lbl, name, layer)):
    inp_in = tf.tensordot(gcn_in, w_in, axes=[2,0]) + tf.expand_dims(b_in, axis=0)
    adj_matrix = tf.transpose(adj_mat[lbl], [0,2,1])
    in_t = self.aggregate(inp_in, adj_matrix)
    if self.p.dropout != 1.0: in_t = tf.nn.dropout(in_t, keep_prob=self.p.dropout)
    if w_gating:
        inp_gin = tf.tensordot(gcn_in, tf.sigmoid(w_gin), axes=[2,0]) + tf.expand_dims(b_gin, axis=0)
        in_act = self.aggregate(inp_gin, adj_matrix)
    else:
        in_act = in_t
It seems to me that the computed in_t (or inp_in) is never used when gating is enabled, which does not match the formula, where the gate is multiplied with the transformed input. As a consequence, the weights w_in and w_out would never be updated in the gated case. Could you please explain how in_t, w_in, and w_out are used under the gating mechanism in your code?
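To make the concern concrete, here is a minimal, dependency-free sketch in plain Python rather than TensorFlow. It mirrors only the gated branch quoted above, under some assumptions of mine: self.aggregate is taken to be a product of the adjacency matrix with the node features, biases and dropout are omitted, and the helper names and toy values are hypothetical. It suggests that the gated output is identical for two very different w_in matrices, so no gradient would reach w_in.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def original_gated_branch(h, adj, w_in, w_gin):
    """Mirrors the quoted gated branch (biases and dropout omitted)."""
    inp_in = matmul(h, w_in)  # computed, but never used below
    # The sigmoid is applied to the gate weights, as in the quoted code.
    inp_gin = matmul(h, [[sigmoid(w) for w in row] for row in w_gin])
    # Assumption: aggregate(x, adj) behaves like adj @ x.
    return matmul(adj, inp_gin)

# Hypothetical toy graph: 3 nodes with 2 features each.
h = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
adj = [[0, 1, 0], [1, 0, 1], [0, 0, 0]]
w_gin = [[0.3], [-0.7]]
w_in_a = [[1.0, 0.0], [0.0, 1.0]]
w_in_b = [[5.0, -2.0], [4.0, 9.0]]

out_a = original_gated_branch(h, adj, w_in_a, w_gin)
out_b = original_gated_branch(h, adj, w_in_b, w_gin)
print(out_a == out_b)  # prints True: the output does not depend on w_in
```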
Many thanks.
Hi @Punchwes,
Thanks for pointing it out. It seems I made an error while simplifying the code. Sorry for the trouble. Please check now and let me know whether it is consistent.
Thanks
Hi @svjan5,
Thanks so much for the update. It seems that some parts of the code were not updated when the change was made:
with tf.name_scope('in_arcs-%s_name-%s_layer-%d' % (lbl, name, layer)):
    inp_in = tf.tensordot(gcn_in, w_in, axes=[2,0]) + tf.expand_dims(b_in, axis=0)
    adj_matrix = tf.transpose(adj_mat[lbl], [0,2,1])
    if self.p.dropout != 1.0:
        inp_in = tf.nn.dropout(inp_in, keep_prob=self.p.dropout)
    if w_gating:
        inp_gin = tf.tensordot(gcn_in, w_gin, axes=[2,0]) + tf.expand_dims(b_gin, axis=0)
        inp_in = inp_in * tf.sigmoid(inp_gin)
        in_act = self.aggregate(inp_in, adj_matrix)
    else:
        in_act = in_t
In the else branch, in_t is not defined; the line in_t = self.aggregate(inp_in, adj_matrix) seems to have been deleted. And in the out_arcs block:
with tf.name_scope('out_arcs-%s_name-%s_layer-%d' % (lbl, name, layer)):
    inp_out = tf.tensordot(gcn_in, w_out, axes=[2,0]) + tf.expand_dims(b_out, axis=0)
    adj_matrix = adj_mat[lbl]
    if self.p.dropout != 1.0:
        inp_out = tf.nn.dropout(inp_out, keep_prob=self.p.dropout)
    if w_gating:
        inp_gout = tf.tensordot(gcn_in, w_gout, axes=[2,0]) + tf.expand_dims(b_gout, axis=0)
        inp_out = inp_out * tf.sigmoid(inp_gout)
        out_act = self.aggregate(inp_gout, adj_matrix)
    else:
        out_act = out_t
The line out_act = self.aggregate(inp_gout, adj_matrix) seems to need replacing with out_act = self.aggregate(inp_out, adj_matrix), as in the in_arcs block. The out_arcs block also lacks a definition of out_t.
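For reference, the behaviour I would expect from one direction of the layer can be sketched in plain Python as below. This is only an illustration, not the repository's implementation: it assumes self.aggregate is a product of the adjacency matrix with the node features, that the gate is a single scalar per node, and that dropout can be left out. The function name gated_aggregate and the toy values are hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_aggregate(h, adj, w, b, w_g, b_g, gating=True):
    """One direction (in or out arcs) for a single label; dropout omitted."""
    # Linear transform of every node: h @ w + b
    msg = [[v + bj for v, bj in zip(row, b)] for row in matmul(h, w)]
    if gating:
        # Assumption: one scalar gate per node, sigmoid(h @ w_g + b_g)
        gate_logits = matmul(h, w_g)
        msg = [[v * sigmoid(g[0] + b_g) for v in row]
               for row, g in zip(msg, gate_logits)]
    # Both branches aggregate the (possibly gated) transformed input,
    # so neither branch needs a separate in_t or out_t.
    return matmul(adj, msg)

# Hypothetical toy graph: 3 nodes with 2 features each.
h = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
adj = [[0, 1, 0], [1, 0, 1], [0, 0, 0]]
w = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
w_g, b_g = [[0.0], [0.0]], 0.0  # zero gate parameters give a gate of 0.5

ungated = gated_aggregate(h, adj, w, b, w_g, b_g, gating=False)
gated = gated_aggregate(h, adj, w, b, w_g, b_g)
# Doubling w changes the gated output, so w now takes part in the computation.
changed = gated_aggregate(h, adj, [[2.0, 0.0], [0.0, 2.0]], b, w_g, b_g)
```

In this sketch the in_arcs case would pass the transposed adjacency matrix and the out_arcs case the untransposed one, matching the two blocks quoted above.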
Best,
Qiwei
Hi @Punchwes,
It would be great if you could create a pull request with the changes you have described.
Thanks in advance.
Hi @svjan5,
I have created a pull request with the changes mentioned above. Please take a look.
Many thanks
Thanks for your help!