summmeer/session-based-news-recommendation

A question about tensor dimensions

Friest-a11y opened this issue · 9 comments

Hello, after some debugging I ran into the problem below. After checking, the problematic line should be
res_inter = linear_3d(interval, 1, hidden_size, stddev, "inter_linear_trans", active=None) # [batch_size, time_step, hidden_size]
which causes the tensor dimension mismatch.

Traceback (most recent call last):
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1607, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 64 and 1 for 'multi_attention/inter_linear_trans/MatMul' (op: 'BatchMatMulV2') with input shapes: [?,?,64], [?,1,250].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/Administrator/Desktop/session-based-news-recommendation-master改动版/main.py", line 129, in
main(args)
File "C:/Users/Administrator/Desktop/session-based-news-recommendation-master改动版/main.py", line 67, in main
model = getattr(modelname, "Seq2SeqAttNN")(args)
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\model_combine.py", line 116, in init
scope="multi_attention", hidden_size=self.hidden_size, stddev=self.stddev)
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\modules.py", line 132, in multi_attention_layer
stddev) # [batch_size, time_step]
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\modules.py", line 151, in count_alpha_m
active=None) # [batch_size, time_step, hidden_size]
File "C:\Users\Administrator\Desktop\session-based-news-recommendation-master改动版\modules.py", line 76, in linear_3d
res = tf.matmul(inputs, w)
File "D:\python\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "D:\python\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2716, in matmul
return batch_mat_mul_fn(a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
File "D:\python\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 1712, in batch_mat_mul_v2
"BatchMatMulV2", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "D:\python\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1770, in init
control_input_ops)
File "D:\python\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1610, in _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 64 and 1 for 'multi_attention/inter_linear_trans/MatMul' (op: 'BatchMatMulV2') with input shapes: [?,?,64], [?,1,250].

Thanks for your attention. It is a bug introduced when I reorganized my code. After checking, the line res_inter = linear_3d(interval, 1, hidden_size, stddev, "inter_linear_trans", active=None) # [batch_size, time_step, hidden_size] should be res_inter = linear_3d(interval, edim3, hidden_size, stddev, "inter_linear_trans", active=None) # [batch_size, time_step, hidden_size].

I will update this part.
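For clarity, here is a minimal, self-contained sketch of why the original call fails and why the fix works. It does not use the repository's actual linear_3d helper; the variable names and the tiling of the weight are only illustrative, and edim3 = 64 / hidden_size = 250 are read off the traceback.

import tensorflow as tf

batch_size, time_step, edim3, hidden_size = 8, 5, 64, 250

interval = tf.random.normal([batch_size, time_step, edim3])

# Buggy call (edim argument = 1): the weight ends up as [batch_size, 1, hidden_size],
# so the inner dimensions 64 and 1 cannot be multiplied, which is exactly the
# BatchMatMulV2 error in the traceback.
w_bad = tf.tile(tf.random.normal([1, 1, hidden_size]), [batch_size, 1, 1])
# tf.matmul(interval, w_bad)  # raises InvalidArgumentError

# Fixed call (edim argument = edim3): the weight is [batch_size, edim3, hidden_size],
# so the product is well defined.
w_good = tf.tile(tf.random.normal([1, edim3, hidden_size]), [batch_size, 1, 1])
res_inter = tf.matmul(interval, w_good)  # [batch_size, time_step, hidden_size]
print(res_inter.shape)  # (8, 5, 250)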

Yes, I didn't explicitly handle the log output, but you can run the code with the nohup command and save the logs to a file.
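For example, assuming main.py is the entry point (as in the traceback above) and train.log is just an illustrative file name, a command like

nohup python main.py > train.log 2>&1 &

runs the training in the background and writes both stdout and stderr to train.log.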

Maybe there should be a figure? But I guess you're talking about the loss function. In the paper, xc_j is the encoded article, which is encoded purely from its textual information (i.e. the titles of news articles), so the problem you raise about the start time of negative samples does not exist. Only the session vector xc_s is encoded with the start time, and this vector is extracted from a sequence of articles (i.e. xc_i).

Yes, xc_j is the negative sample, but it contains only the content. xc_s is the vector that represents the whole session, not any single article, so there is no problem deriving xc_s from the start time. Please note that xc_j is different from xc_s.