johnny12150/NISER

Consider masking out the padding embedding at the tail of each session and adding a positional embedding.

Closed this issue · 7 comments

In the original paper the normalization is computed over the items within one session. However, when implementing the algorithm we need to pad some positions so that all sessions in a batch have the same length, so when computing the L2 norm the items at the padded (irrelevant) positions should be ignored. The author also found that adding a positional embedding helps a little. By the way, I'm looking forward to your numbers on Star GNN. I have reproduced the numbers on yoochoose1/64, but I can't get the same numbers on diginetica. Thank you a lot!
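Concretely, something like this is what I have in mind (just a rough sketch assuming a PyTorch implementation; the tensor names and shapes are illustrative, not taken from your repo):

```python
import torch
import torch.nn.functional as F

# Illustrative sizes: a batch of B sessions padded to max_len.
B, max_len, hidden = 32, 20, 256
item_emb = torch.randn(B, max_len, hidden)      # padded item embeddings
lengths = torch.randint(1, max_len + 1, (B,))   # true session lengths

# mask: 1 for real items, 0 for the padded tail of each session
mask = (torch.arange(max_len).unsqueeze(0) < lengths.unsqueeze(1)).float()

# learnable positional embedding added before the readout
pos_emb = torch.nn.Embedding(max_len, hidden)
item_emb = item_emb + pos_emb(torch.arange(max_len)).unsqueeze(0)

# L2-normalize each item embedding, then zero out the padded positions
# so they do not leak into the session representation
item_emb = F.normalize(item_emb, p=2, dim=-1)
item_emb = item_emb * mask.unsqueeze(-1)
```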

Thanks for the advice.
I will update the code to see if the performance improves.

By the way, could you share the Star GNN code with me?
I am still checking my code since I can reach the performance claimed in the paper on both datasets.

Do you mean you have reached the performance claimed in the paper on both datasets? If so, could you please release the code? Thanks! I will share my code with you once I have organized it well. I just found that layer norm is useless, and I am trying to improve the numbers on diginetica.

Oops! I meant I haven't reached the performance yet; it should be "can't".
Sorry for the inconvenience.

I found that fixing the order of the training samples and using hidden size 256 (as in the original paper) with L2 normalization increases the numbers greatly. Maybe you can try them in your Star GNN code.
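For reference, the hidden size / L2 normalization part would look roughly like this (a sketch assuming PyTorch; `scale` is just a placeholder for the constant the cosine scores are multiplied by, and the sizes are illustrative):

```python
import torch
import torch.nn.functional as F

hidden, n_items = 256, 40000            # larger hidden size instead of 100
scale = 16.0                            # illustrative scaling constant for the scores

item_table = torch.nn.Embedding(n_items, hidden)
session_repr = torch.randn(32, hidden)  # session embeddings from the GNN readout

# normalize both the session and the candidate item embeddings, then score
# with a scaled dot product (i.e. scaled cosine similarity)
s = F.normalize(session_repr, p=2, dim=-1)
v = F.normalize(item_table.weight, p=2, dim=-1)
logits = scale * s @ v.t()              # (batch, n_items)
```

As far as I understand, fixing the order of the training samples is just a matter of seeding and disabling shuffling in the data loader.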

So does it seem that the performance gain comes from the L2 norm rather than the star graph topology?

Mainly from the larger hidden size and the L2 norm (which was introduced in NISER); the star topology only brings a small gain. If you set the hidden size to 100 like SR-GNN and NISER, the performance drops a lot.

Hidden size and L2 norm really boost the performance of many models!