Issue with code sample in book from chapter 3 "PRE-TRAINED WORD EMBEDDINGS"
david-sitsky opened this issue · 0 comments
Hi - apologies if this is the wrong place to report this, but I have been reading the online version of this book, and when I try to run the following code sample from chapter 3 with the path to the model updated:
from gensim.models import Word2Vec, KeyedVectors
pretrainedpath = "NLPBookTut/GoogleNews-vectors-negative300.bin"
w2v_model = KeyedVectors.load_word2vec_format(pretrainedpath, binary=True)
print('done loading Word2Vec')
print(len(w2v_model.vocab)) #Number of words in the vocabulary.
print(w2v_model.most_similar['beautiful'])
W2v_model['beautiful']
It fails with the following:
$ python3 word2vec.py
done loading Word2Vec
Traceback (most recent call last):
  File "word2vec.py", line 5, in <module>
    print(len(w2v_model.vocab)) #Number of words in the vocabulary.
  File "/home/sits/.local/lib/python3.8/site-packages/gensim/models/keyedvectors.py", line 645, in vocab
    raise AttributeError(
AttributeError: The vocab attribute was removed from KeyedVector in Gensim 4.0.0.
Use KeyedVector's .key_to_index dict, .index_to_key list, and methods .get_vecattr(key, attr) and .set_vecattr(key, attr, new_val) instead.
See https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4
I can see that the code for Ch3 in the repository has already been changed to take this into account, e.g. by removing the len() call and using code like:
print(w2v_model.most_similar('beautiful'))
Can the online book be updated with the correct code?
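For reference, here is a rough sketch of how the sample might look if it needs to run on both Gensim 3.x and 4.x. It is only my guess at a fix, not the authors' code: the names vocab_size and main are made up for illustration, and it relies on the key_to_index attribute named in the error message above.

```python
def vocab_size(kv):
    """Return the number of words in a gensim KeyedVectors-like object.

    Hypothetical helper: prefers the Gensim 4.x attribute and falls back
    to the Gensim 3.x one.
    """
    # Gensim 4.x: key_to_index is a dict mapping each word to its row index.
    key_to_index = getattr(kv, "key_to_index", None)
    if key_to_index is not None:
        return len(key_to_index)
    # Gensim 3.x: .vocab is a dict keyed by word.
    return len(kv.vocab)


def main(pretrainedpath="NLPBookTut/GoogleNews-vectors-negative300.bin"):
    # Imported here so vocab_size can be used without gensim installed.
    from gensim.models import KeyedVectors

    w2v_model = KeyedVectors.load_word2vec_format(pretrainedpath, binary=True)
    print("done loading Word2Vec")
    print(vocab_size(w2v_model))                # number of words in the vocabulary
    print(w2v_model.most_similar("beautiful"))  # a method call, so parentheses
    print(w2v_model["beautiful"])               # the vector for a single word
```

Calling main() with the path to the downloaded model should then print the same three things the book's sample intends to print.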