gusye1234/LightGCN-PyTorch

RAM problem while generating adjacency matrix

kontrabas380 opened this issue · 3 comments

Hello,

I wanted to try this solution on my own dataset, which consists of around 1 million users, about 300k items, and about 42 million interactions for training. Unfortunately, after I prepared the data and started the script, the process was killed for exceeding 240 GiB of RAM. It happens while generating the adjacency matrix in the dataloader, on this line:
adj_mat[:self.n_users, self.n_users:] = R

Is there a way to do this differently, or is my dataset simply too big to use with LightGCN?

Best regards

Hi, for readability and convenience we chose to use one big matrix to hold both R and R.T, so right now that's the only way in this implementation.
We're aware of the potential RAM overflow problem too, so we're happy to discuss a more memory-efficient way to store the adjacency matrix while keeping LightGCN's functionality.
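
For reference, one memory-efficient alternative is to assemble the two blocks with scipy's sp.bmat instead of slice-assigning into a big LIL matrix. A minimal sketch, assuming R is the (n_users x n_items) interaction matrix in csr format (self.UserItemNet in the repo's dataloader):

```python
import numpy as np
import scipy.sparse as sp

def build_norm_adj(R):
    """R: (n_users x n_items) user-item interaction csr_matrix
    (self.UserItemNet in the repo's dataloader)."""
    # sp.bmat assembles [[0, R], [R.T, 0]] block-wise, avoiding both
    # a dense intermediate and the expensive LIL slice assignment
    adj_mat = sp.bmat([[None, R], [R.T, None]],
                      format="csr", dtype=np.float32)
    # symmetric normalization D^{-1/2} A D^{-1/2}, matching what
    # getSparseGraph() computes
    rowsum = np.asarray(adj_mat.sum(axis=1)).flatten()
    with np.errstate(divide="ignore"):
        d_inv = np.power(rowsum, -0.5)
    d_inv[np.isinf(d_inv)] = 0.0
    d_mat = sp.diags(d_inv)
    return d_mat @ adj_mat @ d_mat
```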

You can simply rewrite the loading method.

For now, the repo builds UserItemNet, which is NxM, and assigns it into slices of the bigger matrix, which costs a lot of time and RAM.
You can instead read each item's remapped ID as remapped_item_idx + num_of_users, then use that to generate the sparse adjacency matrix directly, which comes down to a single csr_matrix call (see the sketch after these steps).

To do so, modify the reading method here: https://github.com/gusye1234/LightGCN-PyTorch/blob/master/code/dataloader.py#L241
Then modify this method: https://github.com/gusye1234/LightGCN-PyTorch/blob/master/code/dataloader.py#L332

And note that, in order to know the number of users before you read the train user-item data (UserItemNet), you may need to read user_list.txt in advance.
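
Putting those steps together, a rough sketch of the direct construction. It assumes the repo's train.txt format (each line: "user item1 item2 ...") and a user_list.txt with one user per line; the file names and the one-user-per-line layout are assumptions, so adapt them to your preprocessing (e.g. skip a header row if your file has one). The normalization from the earlier sketch can then be applied to the result.

```python
import numpy as np
import scipy.sparse as sp

def build_adj_from_files(user_list_path, train_path):
    # read user_list.txt first so n_users is known before the
    # train data is parsed (assumed layout: one user per line)
    with open(user_list_path) as f:
        n_users = sum(1 for _ in f)

    users, items = [], []
    # repo-style train.txt: each line is "user item1 item2 ..."
    with open(train_path) as f:
        for line in f:
            ids = line.strip().split()
            if len(ids) < 2:
                continue
            u = int(ids[0])
            for i in ids[1:]:
                users.append(u)
                # remap: item node ID = remapped_item_idx + n_users,
                # so users and items share one node-ID space
                items.append(int(i) + n_users)

    n_nodes = max(items) + 1  # = n_users + n_items for contiguous IDs
    # one csr_matrix call: each interaction (u, i) contributes both the
    # (u, i) and (i, u) entries of the symmetric adjacency
    rows = np.concatenate([users, items])
    cols = np.concatenate([items, users])
    data = np.ones(len(rows), dtype=np.float32)
    return sp.csr_matrix((data, (rows, cols)), shape=(n_nodes, n_nodes))
```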

And besides all that (which I have tried), you still need to think about the training cost of a larger graph, since it will take much more time than the original small graph.

And besides all that, may I ask the team one more question, @gusye1234: LightGCN is not based on GraphSAGE, right? It uses the whole graph for propagation in the computer() method, which is more of a GCN style. I thought LightGCN was simply NGCF with all the non-linear layers removed, but it seems that's not quite it? I'm a little confused; please correct me if I'm wrong.

Thanks @Gongzq5 for the patient reply.
As for your question, you can think of LightGCN as a reduced NGCF, and that was exactly the original idea: if the non-linear transformations are not as useful as we thought in NGCF, then why not use a reduced NGCF?
And it turns out the result is a pretty clean and simple message-passing mechanism, in a GCN style.