summmeer/session-based-news-recommendation

A question about tensor dimensions

Friest-a11y opened this issue · 9 comments

Hello, after some debugging I ran into the problem below. After checking, the problematic line should be
res_inter = linear_3d(interval, 1, hidden_size, stddev, "inter_linear_trans", active=None) # [batch_size, time_step, hidden_size]
which causes the tensor dimension mismatch.

Traceback (most recent call last):
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1607, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 64 and 1 for 'multi_attention/inter_linear_trans/MatMul' (op: 'BatchMatMulV2') with input shapes: [?,?,64], [?,1,250].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/Administrator/Desktop/session-based-news-recommendation-master改动版/main.py", line 129, in
main(args)
File "C:/Users/Administrator/Desktop/session-based-news-recommendation-master改动版/main.py", line 67, in main
model = getattr(modelname, "Seq2SeqAttNN")(args)
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\model_combine.py", line 116, in init
scope="multi_attention", hidden_size=self.hidden_size, stddev=self.stddev)
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\modules.py", line 132, in multi_attention_layer
stddev) # [batch_size, time_step]
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\modules.py", line 151, in count_alpha_m
active=None) # [batch_size, time_step, hidden_size]
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\modules.py", line 76, in linear_3d
res = tf.matmul(inputs, w)
File "D:\python\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "D:\python\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2716, in matmul
return batch_mat_mul_fn(a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
File "D:\python\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 1712, in batch_mat_mul_v2
"BatchMatMulV2", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "D:\python\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1770, in init
control_input_ops)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1610, in _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 64 and 1 for 'multi_attention/inter_linear_trans/MatMul' (op: 'BatchMatMulV2') with input shapes: [?,?,64], [?,1,250].

Thanks for your attention. It is a bug introduced when I reorganized my code. After checking, the line res_inter = linear_3d(interval, 1, hidden_size, stddev, "inter_linear_trans", active=None) # [batch_size, time_step, hidden_size] should be res_inter = linear_3d(interval, edim3, hidden_size, stddev, "inter_linear_trans", active=None) # [batch_size, time_step, hidden_size].

I will update this part.
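For clarity, here is a minimal, self-contained sketch of why the original call fails and why the fix works. It does not use the repository's actual linear_3d helper; the variable names and the tiling of the weight are only illustrative, and edim3 = 64 / hidden_size = 250 are read off the traceback.

import tensorflow as tf

batch_size, time_step, edim3, hidden_size = 8, 5, 64, 250

interval = tf.random.normal([batch_size, time_step, edim3])

# Buggy call (edim argument = 1): the weight ends up as [batch_size, 1, hidden_size],
# so the inner dimensions 64 and 1 cannot be multiplied, which is exactly the
# BatchMatMulV2 error in the traceback.
w_bad = tf.tile(tf.random.normal([1, 1, hidden_size]), [batch_size, 1, 1])
# tf.matmul(interval, w_bad)  # raises InvalidArgumentError

# Fixed call (edim argument = edim3): the weight is [batch_size, edim3, hidden_size],
# so the product is well defined.
w_good = tf.tile(tf.random.normal([1, edim3, hidden_size]), [batch_size, 1, 1])
res_inter = tf.matmul(interval, w_good)  # [batch_size, time_step, hidden_size]
print(res_inter.shape)  # (8, 5, 250)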

Yes, I didn't explicitly handle the log output, but you can run the code with the nohup command and save the logs to a file.
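For example, assuming main.py is the entry point (as in the traceback above) and train.log is just an illustrative file name, a command like

nohup python main.py > train.log 2>&1 &

runs the training in the background and writes both stdout and stderr to train.log.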

Maybe there should be a figure? But I guess you're talking about the loss function. In the paper, xc_j is the encoded article, which is encoded purely from its textual information (i.e. the titles of news articles), so the problem you raise about the start time of negative samples does not exist. Only the session vector xc_s is encoded with the start time, and this vector is extracted from a sequence of articles (i.e. xc_i).

Yes, xc_j is the negative sample, but it contains only the content. xc_s is the vector that represents the whole session, not any single article, so there is no problem deriving xc_s from the start time. Please note that xc_j is different from xc_s.