Option to use pre-trained embeddings as initializers
mmcenta opened this issue · 2 comments
I don't know if this feature is available yet, but I needed to initialize the node embeddings with pre-trained vectors and couldn't find a way to do it (I believe the underlying gensim library supports it).
I will probably implement this feature for my own use. If there is interest, I can send a PR and we can figure it out.
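For reference, the core of such a feature is independent of gensim: build the embedding matrix, copy over the pre-trained vectors for nodes that have one, and fall back to the usual small random initialization for the rest. A minimal sketch (the function name and the `pretrained` dict-of-vectors format are illustrative, not part of any existing API; gensim has also offered an `intersect_word2vec_format` helper for a similar purpose in some versions):

```python
import numpy as np

def init_embeddings(vocab, dim, pretrained, rng=None):
    """Build an embedding matrix, seeding rows from pre-trained vectors.

    vocab: list of node ids (row order of the matrix).
    pretrained: dict mapping node id -> np.ndarray of shape (dim,).
    Nodes without a pre-trained vector keep a small random vector,
    matching the usual word2vec-style initialization.
    """
    rng = rng or np.random.default_rng(0)
    emb = (rng.random((len(vocab), dim)) - 0.5) / dim
    for i, node in enumerate(vocab):
        vec = pretrained.get(node)
        if vec is not None:
            if vec.shape != (dim,):
                raise ValueError(f"vector for {node!r} has dimension "
                                 f"{vec.shape}, expected ({dim},)")
            emb[i] = vec
    return emb
```

The dimension check matters in practice: as noted below, the pre-trained vectors must match the model's embedding dimension exactly.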
It is not supported in DeepWalk yet. Will be happy to take a look if you send a PR :-)
One thing I am not sure about is the context vectors. Since each word has two vectors (an embedding vector and a context vector), how are the context vectors set when you load pre-trained embeddings? If they are still randomly initialized, will that make the pre-trained embeddings less effective?
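To make the question concrete: in skip-gram with negative sampling, every training step reads and writes both matrices, so pre-loading only one of them still leaves the other to be learned from scratch. A minimal sketch of one update (assuming standard SGNS; names like `W_in`/`W_out` are illustrative):

```python
import numpy as np

def sgns_step(W_in, W_out, center, context, negatives, lr=0.025):
    """One skip-gram negative-sampling update.

    W_in  holds the embedding vectors (what one would pre-load),
    W_out holds the context vectors (typically random-initialized).
    center/context/negatives are row indices into the matrices.
    """
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    v = W_in[center]
    grad_v = np.zeros_like(v)
    # positive pair gets label 1, sampled negatives get label 0
    for idx, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = W_out[idx]
        g = lr * (label - sigmoid(v @ u))
        grad_v += g * u       # accumulate gradient for the center vector
        W_out[idx] += g * v   # update the context vector in place
    W_in[center] += grad_v
```

Since both matrices move toward each other during training, a randomly initialized `W_out` does pull the pre-loaded `W_in` rows away from their starting values, which is exactly the concern raised above.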
I don't think I understand your question 😅
What I mean by using pre-trained embeddings is that the context vectors are initialized to a set of pre-defined embeddings given by the user. This means the pre-trained vectors need to have the same dimension as the model's embeddings!
Perhaps an example will help: I am currently training a model for link prediction on the French web, and I am testing an approach in which I insert text information into the graph and then use the DeepWalk embeddings as input to a classifier. One idea is to initialize each node's embedding to the text embedding of the corresponding webpage, which has the same dimension.
As for the implementation, I am working on it right now. I am trying to understand how you deal with walks that don't fit in memory.
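On walks that don't fit in memory: one common pattern (used by DeepWalk-style pipelines that write walks to disk) is a restartable corpus object that re-reads the walk files on every pass, since gensim's Word2Vec iterates over the corpus once per epoch and a plain generator would be exhausted after the first pass. A minimal sketch (the class name is mine, not from the repo; one whitespace-separated walk per line):

```python
class WalksCorpus:
    """Restartable iterator over walk files stored on disk.

    Each line of each file is one walk: whitespace-separated node ids.
    Because __iter__ reopens the files, the object can be iterated
    multiple times, which gensim-style training loops require.
    """
    def __init__(self, paths):
        self.paths = list(paths)

    def __iter__(self):
        for path in self.paths:
            with open(path) as f:
                for line in f:
                    walk = line.split()
                    if walk:  # skip blank lines
                        yield walk
```

Only one file's buffer is open at a time, so memory use stays flat no matter how many walks are on disk.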