Negative samples should be different in each training epoch
KylinA1 opened this issue · 4 comments
- Negative samples in the training data should be re-drawn in each training epoch; otherwise the model will overfit.
- When you sample negative items to build the training dataset, the positive item from the test dataset should not be included in the user's positive dictionary, so that it also has a chance of becoming a negative item in the training data.
- Both of the above can be found in the official implementation.
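To make the first point concrete, here is a minimal sketch of re-drawing negatives at the start of every epoch. It is only an illustration, not code from either implementation: `sample_negatives`, its parameters, and the toy data are hypothetical names chosen for this example.

```python
import random


def sample_negatives(train_pairs, train_positives, num_items, num_neg, rng):
    """Return (user, item, label) triples with freshly drawn negatives.

    Hypothetical helper: every name here is an assumption for illustration.
    """
    samples = []
    for user, item in train_pairs:
        samples.append((user, item, 1))  # observed interaction
        for _ in range(num_neg):
            neg = rng.randrange(num_items)
            # Reject only items the user interacted with in the training set.
            while neg in train_positives[user]:
                neg = rng.randrange(num_items)
            samples.append((user, neg, 0))  # sampled negative
    return samples


# Toy data: 2 users, 50 items, 4 negatives per positive.
train_pairs = [(0, 1), (0, 2), (1, 0)]
train_positives = {0: {1, 2}, 1: {0}}
rng = random.Random(0)

# Call the sampler once per epoch instead of once before training.
epoch_samples = [
    sample_negatives(train_pairs, train_positives, 50, 4, rng)
    for _ in range(2)
]
```

Because the sampler is called inside the epoch loop, each epoch sees a different set of negatives for the same positives.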
For your first point, I need to update the add_negative() function in the training data construction. A further modification will follow. Thanks for commenting.
For the second point, in a real scenario all the items that this user has not purchased should be in the candidate negative item set. Therefore, I slightly disagree with Xiangnan He on this point.
Thank you for your kind notes.
I think we share the same opinion: "all the items that are not purchased by this user should be in the candidate negative item set" is exactly what the official implementation does.
In your implementation, there are two dictionaries: user_bought, which contains the test positive item, and user_negative, which you sample negative items from.
Actually, the test positive item should be in the user_negative dictionary rather than in user_bought.
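A rough sketch of what I mean, assuming the data is available as (user, item) pairs. `build_candidate_sets` and the toy data are hypothetical; only the dictionary names `user_bought` and `user_negative` come from your code.

```python
def build_candidate_sets(train_pairs, num_items):
    """Build both dictionaries from TRAINING interactions only.

    Hypothetical helper for illustration; not the repository's actual code.
    """
    user_bought = {}
    for user, item in train_pairs:
        user_bought.setdefault(user, set()).add(item)

    all_items = set(range(num_items))
    # Everything the user did not interact with in training is a candidate
    # negative, including the held-out test positive.
    user_negative = {
        user: all_items - bought for user, bought in user_bought.items()
    }
    return user_bought, user_negative


# Toy data: item 3 is user 0's test positive, item 4 is user 1's.
train_pairs = [(0, 1), (0, 2), (1, 0)]
test_pairs = [(0, 3), (1, 4)]

# The test pairs are deliberately not passed to the builder.
user_bought, user_negative = build_candidate_sets(train_pairs, num_items=5)
```

Here the test positive of each user stays out of user_bought and therefore remains in user_negative, so it can be drawn as a training negative like any other unobserved item.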
Yeah, that is really a bad issue. I will modify this. Thanks!
I fixed this bug; please check the new version.