negative samples should be different in each train epoch

Question

KylinA1 opened this issue 6 years ago · 4 comments

Negative samples in the training data should be different in each training epoch; otherwise the model will overfit.
When you sample negative items to build the training dataset, the positive item in the test dataset should not be included in the user's positive dictionary, so it also has a chance of becoming a negative item in the training data.

You can find both of these in the official implementation.
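To illustrate the first point, here is a minimal sketch of redrawing negatives at the start of every epoch. It is not the code from this repository or from the official implementation: `sample_negatives`, `train_pairs`, `user_train_pos`, `num_items`, and `num_neg` are hypothetical names, and the sketch assumes every user has at least one item they have not interacted with in training.

```python
import random


def sample_negatives(train_pairs, user_train_pos, num_items, num_neg, rng):
    """Build (user, item, label) triples with freshly drawn negatives.

    train_pairs: list of (user, positive_item) from the training split.
    user_train_pos: dict mapping user -> set of training positives only.
    """
    samples = []
    for user, pos_item in train_pairs:
        samples.append((user, pos_item, 1))
        for _ in range(num_neg):
            # Rejection sampling: redraw until the item is not a training positive.
            neg = rng.randrange(num_items)
            while neg in user_train_pos[user]:
                neg = rng.randrange(num_items)
            samples.append((user, neg, 0))
    return samples


# Toy data (hypothetical): 2 users, 50 items.
train_pairs = [(0, 1), (0, 2), (1, 0)]
user_train_pos = {0: {1, 2}, 1: {0}}
rng = random.Random(0)

epochs = []
for epoch in range(2):
    # Called once per epoch, so each epoch sees a different set of negatives.
    epoch_samples = sample_negatives(
        train_pairs, user_train_pos, num_items=50, num_neg=4, rng=rng
    )
    epochs.append(epoch_samples)
    # ... train the model on epoch_samples here ...
```

Calling the sampler inside the epoch loop, rather than once before training, is the whole point: the positives stay fixed while the negatives change from epoch to epoch.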

Answer 1 · 2019-03-01T00:54:59.000Z

For your first question, I should update the add_negative() function inside the training data construction. Further modification will be provided. Thanks for commenting.

For the second question: in a real scenario, all the items that this user has not purchased should be in the candidate negative item set. Therefore, I slightly disagree with Xiangnan He on this point.

Answer 2 · 2019-03-01T01:06:05.000Z

Thank you for your kind notes.
I think we share the same opinion: "all the items that are not purchased by this user should be in the candidate negative item set" is exactly what the official implementation does.
In your implementation, there are two dictionaries: user_bought, which contains the test positive item, and user_negative, from which you sample negative items.
The test positive item should actually be in the user_negative dictionary rather than in user_bought.
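A rough sketch of what I mean, assuming a leave-one-out split with one held-out test item per user. Only the dictionary names `user_bought` and `user_negative` come from your code; `build_user_dicts`, `interactions`, `test_positive`, and `num_items` are hypothetical names for illustration.

```python
def build_user_dicts(interactions, test_positive, num_items):
    """Split items into training positives and negative candidates per user.

    interactions: list of (user, item) pairs covering train and test.
    test_positive: dict mapping user -> the single held-out test item.
    """
    user_bought = {}
    for user, item in interactions:
        bought = user_bought.setdefault(user, set())
        # Keep the held-out test item out of the positive dictionary.
        if test_positive.get(user) != item:
            bought.add(item)

    all_items = set(range(num_items))
    # Everything that is not a training positive is a negative candidate,
    # which includes the held-out test item.
    user_negative = {
        user: sorted(all_items - bought) for user, bought in user_bought.items()
    }
    return user_bought, user_negative


# Toy data (hypothetical): item 3 is the test positive for both users.
interactions = [(0, 1), (0, 2), (0, 3), (1, 0), (1, 3)]
test_positive = {0: 3, 1: 3}
user_bought, user_negative = build_user_dicts(interactions, test_positive, num_items=5)
```

With this split, the training sampler never sees the test item as a known positive, so it can draw that item as a negative just like any other unobserved item.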

Answer 3 · 2019-03-01T01:20:28.000Z

Yeah, that is really a bad issue. I will modify this. Thanks!

Answer 4 · 2019-04-08T00:28:01.000Z

I fixed this bug; please check the new version.