Is dot product the right way to predict?
JoaoLages opened this issue · 2 comments
While training implicit sequence models, we use losses like hinge, BPR, and pointwise. These losses don't directly maximize the dot product, so why do we use it for prediction?
These losses maximize the difference between the dot products of the positive and (implicit) negative items. Since the losses are defined on dot-product scores, ranking items by their dot product at prediction time is exactly the ordering the training objective optimizes, so using the dot product for prediction is appropriate.
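For concreteness, here is a minimal sketch (my own illustration, not Spotlight's actual implementation) of how a BPR-style loss is built entirely from dot-product scores; the tensor names and shapes are assumptions:

```python
# Illustrative sketch of a BPR-style loss over dot-product scores.
# Not Spotlight's actual code; names/shapes are assumptions.
import torch
import torch.nn.functional as F

embedding_dim = 32
user_emb = torch.randn(1, embedding_dim, requires_grad=True)  # one user
pos_emb = torch.randn(1, embedding_dim, requires_grad=True)   # observed (positive) item
neg_emb = torch.randn(1, embedding_dim, requires_grad=True)   # sampled negative item

pos_score = (user_emb * pos_emb).sum(dim=1)  # dot product: user . positive item
neg_score = (user_emb * neg_emb).sum(dim=1)  # dot product: user . negative item

# BPR minimizes -log sigmoid(pos - neg), i.e. maximizes the score gap.
loss = -F.logsigmoid(pos_score - neg_score).mean()
loss.backward()  # gradients flow through both dot products into the embeddings
```

The loss only ever sees the embeddings through their dot products, so pushing the loss down pushes the positive dot product above the negative one; that is precisely the ordering the dot product recovers at prediction time.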
@JoaoLages A bit late to the party, but what we are really optimizing here are the embeddings for users and items. The dot product is merely the operation that combines the two embeddings into a single score. Backprop goes through the dot product and changes the embeddings so that we get the results we want, i.e. it maximizes the score for positive items and minimizes it for negative items. Correct me if I am wrong @maciejkula
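To illustrate this point: once the embeddings are trained, prediction just reads them back out through the same dot product, typically against the whole item table at once. A hypothetical sketch (again, names and shapes are my assumptions, not Spotlight's API):

```python
# Illustrative only: ranking items for one user by dot product
# with every learned item embedding.
import torch

num_items, embedding_dim = 1000, 32
item_embeddings = torch.randn(num_items, embedding_dim)  # learned item table
user_vector = torch.randn(embedding_dim)                 # learned user vector

scores = item_embeddings @ user_vector        # dot product with every item
top_items = torch.topk(scores, k=10).indices  # recommend the highest-scoring items
```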