xiangwang1223/neural_graph_collaborative_filtering

About the norm adj matrix and gcn convolutional layer

yuanyuansiyuan opened this issue · 9 comments

I want to know: if I set the adjacency matrix and the convolutional layer accordingly, is the implementation similar to the ICLR 18 GCN implementation, except for the prediction setting? Here we use the concatenation of each layer's representation, while the ICLR 18 GCN uses only the last layer's representation.

I just want to know the setting most similar to ICLR 18, because it seems the ICLR 18 implementation can achieve better performance than NGCF on my data. Thank you!

Sorry, it is hard for me to understand your description...
Could you please reorganize it? Thanks.

I have implemented a simple GCN following the ICLR 18 version, and it performs better on my data than NGCF. So I want to know how to configure your code to be similar to the ICLR 18 version, since you offer different convolutional layer and adjacency matrix settings. Thank you.

Also, have you tried not using the concatenation of each GCN layer, i.e., using just the representation from the last layer?

Hi,

  1. First, you need to replace the normalized_adj_single() function with the following normalized_adj_bi() in the load_data.py file. You can change it yourself; no guarantee that this implementation is correct. Note: please remove the previously generated .npz files first.

import numpy as np
import scipy.sparse as sp

def normalized_adj_bi(adj):
    # Symmetric normalization: D^{-1/2} A D^{-1/2}.
    rowsum = np.array(adj.sum(1))
    d_inv_sqrt = np.power(rowsum, -0.5).flatten()
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.  # guard against zero-degree nodes
    d_mat_inv_sqrt = sp.diags(d_inv_sqrt)
    # adj is symmetric, so (A D^{-1/2})^T D^{-1/2} = D^{-1/2} A D^{-1/2}.
    bi_adj = adj.dot(d_mat_inv_sqrt).transpose().dot(d_mat_inv_sqrt)
    return bi_adj
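For reference, here is a quick toy comparison of the two normalizations (an illustrative sketch of my own; row_normalized_adj below mirrors the one-sided D^{-1} A normalization that normalized_adj_single() in load_data.py performs, up to implementation details):

import numpy as np
import scipy.sparse as sp

def row_normalized_adj(adj):
    # One-sided (random-walk) normalization: D^{-1} A.
    rowsum = np.array(adj.sum(1))
    d_inv = np.power(rowsum, -1.0).flatten()
    d_inv[np.isinf(d_inv)] = 0.
    return sp.diags(d_inv).dot(adj)

# Toy symmetric adjacency: a 3-node path graph.
adj = sp.csr_matrix(np.array([[0., 1., 0.],
                              [1., 0., 1.],
                              [0., 1., 0.]]))
print(row_normalized_adj(adj).toarray())  # each row sums to 1
print(normalized_adj_bi(adj).toarray())   # symmetric: equal to its own transpose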

  2. Then you should select only the output of the last layer as the final representations of users and items (see the toy sketch after this list); meanwhile, as described in the README file, you can switch to GCN-style propagation via --alg_type gcn.

  3. Compared to GCN, GC-MC uses an additional transformation layer and can be viewed as an advanced variant of GCN; in my experiments, NGCF achieves better performance than GC-MC. Hence, I didn't select GCN as a baseline.

  4. As for the performance on your dataset, there are several possible reasons based on my experience: i) the hyperparameters of NGCF are not well tuned or searched; ii) the user-item interactions are not that sparse, or the number of users or items is too small.
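To make point 2 concrete, here is a toy numpy sketch of the two merge strategies (my own simplification, not the repository's code: it drops the weight matrices, nonlinearities, and the bi-interaction term of NGCF); in the actual NGCF.py, the equivalent change is roughly to take the last element of the layer-wise embedding list instead of concatenating it:

import numpy as np

def propagate(norm_adj, embeddings, n_layers, concat_layers=True):
    # Toy propagation: each layer mixes in neighbor embeddings.
    # concat_layers=True mimics NGCF's merge (concatenate all layers);
    # concat_layers=False mimics plain GCN (keep only the last layer).
    all_embeddings = [embeddings]
    for _ in range(n_layers):
        embeddings = norm_adj @ embeddings
        all_embeddings.append(embeddings)
    if concat_layers:
        return np.concatenate(all_embeddings, axis=1)  # NGCF-style
    return all_embeddings[-1]                          # GCN-style

norm_adj = np.array([[0., .5, .5],
                     [.5, 0., .5],
                     [.5, .5, 0.]])
emb = np.random.rand(3, 4)
print(propagate(norm_adj, emb, 3, concat_layers=True).shape)   # (3, 16)
print(propagate(norm_adj, emb, 3, concat_layers=False).shape)  # (3, 4)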

Thanks.

Thanks very much for your explanation!
So normalized_adj_single() applies one-sided (row) normalization, as for a directed graph, while normalized_adj_bi() applies symmetric two-sided normalization, treating the user-item interactions as an undirected graph?

Actually, my data is small: it only has 300+ users and 700+ items, and the density is 14%. I will tune the parameters more carefully and try to implement the original GCN version.

And is the concatenation of each layer's representation mainly for relieving the sparsity problem?

The concatenation helps with the oversmoothing issue of GNNs: when stacking multiple GNN layers (say > 3), the representations of nodes tend to become similar, and the concatenation highlights the different-order information. For more details, please refer to the paper titled Representation Learning on Graphs with Jumping Knowledge Networks. Thanks.
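A small self-contained illustration of that oversmoothing effect (toy numpy code, not from the repository): repeatedly applying a normalized adjacency drives node representations toward each other, which is exactly what the per-layer concatenation guards against.

import numpy as np

# Adjacency (with self-loops already on the diagonal) of a 4-node path graph.
A = np.array([[1., 1., 0., 0.],
              [1., 1., 1., 0.],
              [0., 1., 1., 1.],
              [0., 0., 1., 1.]])
d = A.sum(1)
norm_adj = A / np.sqrt(np.outer(d, d))  # symmetric normalization D^{-1/2} A D^{-1/2}

emb = np.random.rand(4, 8)
for layer in range(1, 9):
    emb = norm_adj @ emb
    dists = [np.linalg.norm(emb[i] - emb[j])
             for i in range(4) for j in range(i + 1, 4)]
    # The mean pairwise distance keeps shrinking as layers stack up.
    print(f"layer {layer}: mean pairwise distance = {np.mean(dists):.4f}")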

Thank you!

Have you tried NGCF on datasets that are not so sparse (e.g., ml-100k, density ≈ 0.05)?
It seems NGCF performs better on extremely sparse datasets.
A possible reason is that a k-order GNN updates a node's embedding based on its k-hop neighbors, and the k-hop neighborhood may already cover most of the nodes in the graph if the graph is not extremely sparse.
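One way to check this intuition on a given dataset (an illustrative snippet under my own assumptions: the interactions are available as a scipy sparse user-item matrix R of shape [n_users, n_items]):

import numpy as np
import scipy.sparse as sp

def k_hop_coverage(R, k):
    # Fraction of all nodes reachable within k hops, averaged over users.
    n_users, n_items = R.shape
    # Full bipartite adjacency over user and item nodes.
    adj = sp.bmat([[None, R], [R.T, None]], format='csr')
    adj = (adj > 0).astype(np.float64)
    reach = sp.identity(n_users + n_items, format='csr')
    frontier = reach
    for _ in range(k):
        frontier = (frontier @ adj > 0).astype(np.float64)
        reach = ((reach + frontier) > 0).astype(np.float64)
    return reach[:n_users].getnnz() / (n_users * (n_users + n_items))

# Example: random interactions at 5% density.
R = sp.random(300, 700, density=0.05, format='csr', random_state=0)
print(k_hop_coverage(R, k=3))  # typically close to 1.0 at this density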

bbjy commented

The normalized_adj in your article is normalized_adj_bi(), while the normalized_adj in your current code is normalized_adj_single(), is that right? Or do I misunderstand? Which type of normalized adjacency did you use for the experiments? @xiangwang1223