This is the PyTorch implementation of LightGCL, proposed in the paper LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation, International Conference on Learning Representations (ICLR), 2023.
Fig: Model Structure of LightGCL
Due to the large size of the ML-10M, Amazon, and Tmall datasets, we have compressed them into zip files. Please unzip them before running the model on these datasets. For Yelp and Gowalla, the current directory structure can be kept as is.
Before running the code, please ensure that the two directories `log/` and `saved_model/` exist under the root directory. They are used to store the training results and the saved model and optimizer states, respectively.
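For convenience, the two directories can be created with a short Python snippet (or equivalently with `mkdir`); this is just a setup sketch, not part of the repository:

```python
import os

# Create the directories expected for logs and checkpoints, if they do not exist.
for d in ("log", "saved_model"):
    os.makedirs(d, exist_ok=True)
```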
We developed our code in the following environment:

- Python 3.9.12
- torch==1.12.0+cu113
- numpy==1.21.5
- tqdm==4.64.0
- Yelp: `python main.py --data yelp`
- Gowalla: `python main.py --data gowalla --lambda2 0`
- ML-10M: `python main.py --data ml10m --temp 0.5`
- Tmall: `python main.py --data tmall --gnn_layer 1`
- Amazon: `python main.py --data amazon --gnn_layer 1 --lambda2 0 --temp 0.1`
- `--cuda` specifies which GPU to run on if there is more than one.
- `--data` selects the dataset to use.
- `--lambda1` specifies $\lambda_1$, the regularization weight for the CL loss.
- `--lambda2` specifies $\lambda_2$, the L2 regularization weight.
- `--temp` specifies $\tau$, the temperature in the CL loss.
- `--dropout` specifies the edge dropout rate.
- `--q` specifies the rank $q$ for the SVD.
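For reference, a hypothetical `argparse` sketch of how such flags could be declared is shown below; the actual parser and default values live in `main.py` of this repository and may differ:

```python
import argparse

# Illustrative only -- see main.py for the real flag definitions and defaults.
parser = argparse.ArgumentParser(description='LightGCL (illustrative flag sketch)')
parser.add_argument('--cuda', type=str, help='which GPU to run on')
parser.add_argument('--data', type=str, help='dataset: yelp, gowalla, ml10m, tmall, or amazon')
parser.add_argument('--lambda1', type=float, help='regularization weight for the CL loss')
parser.add_argument('--lambda2', type=float, help='L2 regularization weight')
parser.add_argument('--temp', type=float, help='temperature in the CL loss')
parser.add_argument('--dropout', type=float, help='edge dropout rate')
parser.add_argument('--q', type=int, help='rank q for the SVD')
parser.add_argument('--gnn_layer', type=int, help='number of GNN layers')
args = parser.parse_args()
```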
We have noticed that many readers are confused about the complexity of performing graph convolution on the SVD-reconstructed view, arguing that it should be O(2IJLd) since the SVD-reconstructed view is fully connected. This issue is explained in Appendix D.3 of our paper, and we also answered a GitHub issue about it (issue #3). We clarify it again here:
It is correct that the reconstructed graph is fully connected. However, note that the reconstructed graph is the product of three low-dimensional matrices U, S, V', whose dimensions are I×q, q×q, and q×J, respectively (where q is as small as 5). So we never need to materialize the reconstructed graph; we only store the three low-dimensional matrices. By performing the matrix multiplication in the order US [pre-computed, so its cost is not counted toward training], then V'E [complexity O(qJd)], and then (US)(V'E) [complexity O(qId)], we never construct that large matrix, and the complexity is proportional to (I+J) rather than IJ.
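For illustration, here is a minimal PyTorch sketch of this multiplication order; it is not the repository code, and the tensor names and toy sizes `I`, `J`, `q`, `d` below are assumed purely for the example:

```python
import torch

I, J, q, d = 1000, 800, 5, 64   # toy sizes, assumed for illustration only

u = torch.randn(I, q)           # U from the truncated SVD, I x q
s = torch.diag(torch.rand(q))   # S, q x q diagonal of singular values
v = torch.randn(J, q)           # V, J x q (so V' is q x J)
E = torch.randn(J, d)           # item embeddings, J x d

us = u @ s                      # pre-computed once before training, I x q

# Efficient order: never materialize the I x J reconstructed graph.
vte = v.T @ E                   # V'E, q x d, costs O(qJd)
out = us @ vte                  # (US)(V'E), I x d, costs O(qId)

# Sanity check against the naive path, which builds the full I x J matrix
# (only feasible at toy scale).
naive = (us @ v.T) @ E
print(torch.allclose(naive, out, rtol=1e-3, atol=1e-3))
```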
Please cite our paper if you find the paper and the code helpful.
@inproceedings{caisimple,
title={LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation},
author={Cai, Xuheng and Huang, Chao and Xia, Lianghao and Ren, Xubin},
booktitle={The Eleventh International Conference on Learning Representations},
year={2023}
}