Questions about the construction procedure of item_edges_a and item_edges_b in function build_adj_graph of file run_DCRec.py

Question

cizhouyu opened this issue a year ago · 2 comments

Hello, thank you for your open source code. I have a detailed question that I hope to get answered:
In the build_adj_graph function of run_DCRec.py, the loop body contains two separate if branches, and each of them appends elements to item_edges_a and item_edges_b. As a result, the same item in item_seq can be added to item_edges_a and item_edges_b more than once.

I don't quite understand the reason for this construction procedure. I assumed the if branches were only there for bounds checking, so I tried changing the second if to elif. Training still ran successfully, but the metrics became worse:

Before the change, test result: {'recall@1': 0.2101, 'recall@5': 0.4058, 'recall@10': 0.5149, 'ndcg@1': 0.2101, 'ndcg@5': 0.3128, 'ndcg@10': 0.3478}
After the change, test result: {'hit@1': 0.1793, 'hit@5': 0.3911, 'hit@10': 0.5067, 'ndcg@1': 0.1793, 'ndcg@5': 0.2899, 'ndcg@10': 0.327}

I would be very grateful if you could tell me what item_edges_a and item_edges_b are used for and why such a construction procedure is needed❤️

Answer 1 · 2023-10-03T16:14:01.000Z

Hi,

Thanks for bringing this up. You can simply treat item_edges_a as dst_nodes and item_edges_b as src_nodes when constructing a standard edge list. The first 'if' handles left connectivity in the sequence and the second 'if' handles right connectivity.

So if you remove half of the edges (the right neighbors in the sequences), a performance drop is expected.
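
To make the left/right connectivity point concrete, here is a minimal sketch. It is not the actual body of build_adj_graph: the helper name build_edges, the use_elif flag, and the exact bounds checks are assumptions made for illustration. Only the roles of the two lists (dst_nodes and src_nodes) and the two branches come from the explanation above.

```python
def build_edges(item_seq, use_elif=False):
    """Hypothetical helper illustrating the two branches; not the repo's code.

    Returns (dst_nodes, src_nodes), which play the roles of
    item_edges_a and item_edges_b respectively.
    """
    dst_nodes, src_nodes = [], []
    last = len(item_seq) - 1
    for i, item in enumerate(item_seq):
        has_left = i > 0
        has_right = i < last
        # First branch: connect the item to its left neighbor.
        if has_left:
            dst_nodes.append(item)
            src_nodes.append(item_seq[i - 1])
        # Second branch: connect the item to its right neighbor.
        # With use_elif=True this mimics `elif`: it only runs when the
        # first branch did not, i.e. only for the first position.
        if has_right and not (use_elif and has_left):
            dst_nodes.append(item)
            src_nodes.append(item_seq[i + 1])
    return dst_nodes, src_nodes


seq = [10, 20, 30]

# Two independent ifs: every interior item appears twice as a destination,
# once for each neighbor.
both = build_edges(seq)
# both == ([10, 20, 20, 30], [20, 10, 30, 20])

# Second if changed to elif: right-neighbor edges survive only at position 0.
left_only = build_edges(seq, use_elif=True)
# left_only == ([10, 20, 30], [20, 10, 20])
```

Under these assumptions, a sequence of length n yields 2(n - 1) edges with two ifs and only n edges with elif, so for long sequences roughly half of the edges disappear. The repeated appearance of an item in the destination list is therefore intended: each occurrence corresponds to a different neighbor.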

Answer 2 · 2023-10-04T01:20:45.000Z

Okay, I understand. Thanks for your quick response!