williamleif/histwords

Ask for help

Closed this issue · 1 comments

Sorry to bother you. I'm new to this and have tried my best to study the code, but I can't work out how to solve the following. There may be two errors in the scripts.

1. When I ran the command below, it generated 5 files (sgns.contexts.txt, sgns.contexts.bin, sgns.words.txt, sgns.words.bin, sgns.words.words). What is the last file, and why is sgns.words.words blank? Can you provide the source code for word2vecf?

word2vecf/word2vecf -train w2.sub/pairs -pow 0.75 -cvocab w2.sub/counts.contexts.vocab -wvocab w2.sub/counts.words.vocab -dumpcv w2.sub/sgns.contexts -output w2.sub/sgns.words -threads 10 -negative 15 -size 500;

2. The vecanalysis folder has no representations module. If I want to get the aligned embeddings, how should I do that? What parameters should I pass to seq_procrustes.py?

from vecanalysis.representations.representation_factory import create_representation

Thanks for your help. @williamleif

Hmm, the way that I use the scripts is to run:

https://github.com/williamleif/histwords/sgns/runword2vec.py

(for word2vecf information see: https://bitbucket.org/omerlevy/hyperwords or use the makecorpus.py script in my package)

and then you can run:

https://github.com/williamleif/histwords/sgns/postprocesssgns.py

To post-process the word2vecf output into the format histwords uses.
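As a rough illustration of what such a post-processing step involves, here is a minimal sketch that reads word2vec-style text vectors into a vocabulary list and a NumPy matrix. It assumes the text output starts with a header line `n_words dim` followed by one `word v1 ... vd` line per word. The function name `load_text_vectors` and the sample data are hypothetical, and this is not the code in postprocesssgns.py, whose exact output format is not described in this thread.

```python
import io

import numpy as np


def load_text_vectors(handle):
    """Parse word2vec-style text vectors from an open text handle.

    Assumed layout: a header line 'n_words dim', then one line per word
    of the form 'word v1 v2 ... vd'. Blank lines are skipped.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vocab, rows = [], []
    for line in handle:
        parts = line.split()
        if not parts:
            continue
        vocab.append(parts[0])
        rows.append([float(x) for x in parts[1:dim + 1]])
    return vocab, np.array(rows, dtype=np.float32)


# Tiny made-up sample standing in for a file such as sgns.words.txt.
sample = io.StringIO("2 3\ncat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
vocab, matrix = load_text_vectors(sample)
```

With a real file, the handle would come from `open(path, encoding="utf-8")` instead of `io.StringIO`.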

Finally, you would run:

https://github.com/williamleif/histwords/seq_procrustes.py

On the output of "postprocesssgns.py".
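For intuition about the alignment step, here is a minimal sketch of orthogonal Procrustes alignment with NumPy. It assumes, as the script name suggests, that seq_procrustes.py aligns embeddings with this technique. It is not the histwords implementation, and the function name `procrustes_align` and the random test matrices are hypothetical. Both matrices must have rows that correspond to the same words in the same order.

```python
import numpy as np


def procrustes_align(base, other):
    """Rotate `other` so that it is as close as possible to `base`.

    Solves min_R ||other @ R - base||_F over orthogonal R using the SVD
    of other.T @ base, then returns the rotated copy of `other`.
    """
    u, _, vt = np.linalg.svd(other.T @ base)
    rotation = u @ vt
    return other @ rotation


# Made-up data: `other` is `base` under a random orthogonal rotation,
# so alignment should recover `base` up to floating-point error.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
other = base @ q
aligned = procrustes_align(base, other)
```

To align a sequence of time periods, one option is to apply the same function to each consecutive pair, aligning each period to the already-aligned previous one.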

Hope that is somewhat helpful!