theeluwin/pytorch-sgns

Different embeddings for input/output words?

phillynch7 opened this issue · 2 comments

Hey there, great skipgram example, so thank you for that.

I have a question: why did you decide to use different embeddings for the "input" words and the "output"/"negative" words? See the lines below:
https://github.com/theeluwin/pytorch-sgns/blob/master/model.py#L29:L30

I imagine this could give better performance on some problems, but I haven't been able to test it myself yet. Thanks for the help!

This implementation is based on the word2vec paper that introduced skip-gram with negative sampling (https://arxiv.org/abs/1310.4546). Using a single shared embedding (also known as siamese modeling) works as well, but following the paper, the model is treated as a 2-layer neural network with one hidden layer, where W1 holds the 'input vectors' and W2 holds the 'output vectors'.
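For reference, here is a minimal sketch (not the repo's exact code) of how the two separate embedding tables enter the SGNS loss. The `ivectors`/`ovectors` names mirror the lines linked above, but the shapes and the loss reduction here are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SGNS(nn.Module):
    """Skip-gram with negative sampling, using separate input (W1)
    and output (W2) embedding matrices."""

    def __init__(self, vocab_size, embedding_dim):
        super().__init__()
        self.ivectors = nn.Embedding(vocab_size, embedding_dim)  # W1: "input" vectors
        self.ovectors = nn.Embedding(vocab_size, embedding_dim)  # W2: "output" vectors

    def forward(self, iword, owords, nwords):
        # iword:  (batch,)            center words
        # owords: (batch, context)    true context words
        # nwords: (batch, negatives)  negatively sampled words
        v = self.ivectors(iword).unsqueeze(2)        # (batch, dim, 1)
        u_pos = self.ovectors(owords)                # (batch, context, dim)
        u_neg = self.ovectors(nwords)                # (batch, negatives, dim)
        pos_score = torch.bmm(u_pos, v).squeeze(2)   # (batch, context)
        neg_score = torch.bmm(u_neg, v).squeeze(2)   # (batch, negatives)
        # SGNS objective: maximize log-sigmoid of positive scores and
        # log-sigmoid of negated negative scores.
        loss = -(F.logsigmoid(pos_score).mean(1)
                 + F.logsigmoid(-neg_score).mean(1)).mean()
        return loss
```

With a shared (siamese) embedding you would replace `self.ovectors` with `self.ivectors` everywhere, which also trains but departs from the paper's interpretation of W1 and W2 as distinct layers.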

Ahh, I see that in the paper now. I should have read it more carefully; I was mostly focused on working out the loss function.

Appreciate your help, thanks!