queries about the nuswide_wordvec text file
zhangzeng97 opened this issue · 10 comments
Hi Cao Yue,
Great thanks for your great job done on the DVSQ project. I am currently working on my project at school. It has helped me a lot.
I have successfully deployed the whole project. However, when I tried to run it with my own dataset, some confusion arose. May I ask where the wordvec file in the data folder comes from? I have read your paper about transforming the image representations into label embeddings, but it does not seem relevant to this file. May I ask how I can generate the word vectors, and which dataset should be converted to word vectors?
Thank you.
Best,
Zhang Zeng
Hi Cao Yue,
Great thanks for your fast reply!
I have looked into that and it helps a lot.
Best,
Zhang Zeng
Hi,
May I ask something about the paper itself here?
I have read through it several times, but there are some points that I cannot understand. For example, why do we need the word embeddings for the labels? I tried to print the output of validation, but it is the 81-dimensional label instead of the 300-dimensional word embeddings.
Thanks a lot:)
Best,
Zhang Zeng
Hi Zeng,
You are right: the label itself is 81-dimensional because NUS-WIDE is an 81-class dataset, and the word embedding of a single label is 300-dimensional.
Actually, because NUS-WIDE is a multi-label dataset, the label representation of an image is an 81 x 300 matrix (not just an 81- or 300-dimensional vector). Specifically, the ith row is the word embedding of label i if the image has label i; otherwise, the ith row is all zeros. (You can verify this at line 322 of "net.py".)
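That construction can be sketched in a few lines of numpy (the random wordvec matrix below is just a stand-in for the real 81 x 300 embedding table):

```python
import numpy as np

n_class, embed_dim = 81, 300                   # NUS-WIDE labels, word2vec dimension
wordvec = np.random.randn(n_class, embed_dim)  # stand-in for the real embedding table

label = np.zeros(n_class)                      # multi-hot label of one image
label[[3, 17]] = 1                             # say the image carries labels 3 and 17

# Row i is the word embedding of label i if the image has label i, else all zeros
label_embedding = label[:, None] * wordvec
print(label_embedding.shape)                   # (81, 300)
```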
Hi Bin,
Thank you so much for your fast reply!
I have gone through it again. May I ask what the codebook C mentioned in section 3.2 of the paper is? My understanding is that for 81 classes, each class contains K centers. And is the C here the same as the C at line 68 of the net_val.py file?
I tried to print out self.C from the model, and it is a 1024 x 300 tensor. I think the 300 corresponds to the 300-dimensional word vectors, but I am not sure where the 1024 comes from.
Best,
Zeng
1024 = n_subcenter(256) * n_subspace(4).
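In other words, the 1024 rows come from stacking 4 subspace codebooks of 256 subcenters each. A rough shape check (how DVSQ actually distributes the 300 dimensions among the subspaces is not shown here):

```python
import numpy as np

n_subspace, n_subcenter, dim = 4, 256, 300
# One codebook of 256 subcenters per subspace, stacked row-wise
C = np.concatenate([np.random.randn(n_subcenter, dim) for _ in range(n_subspace)],
                   axis=0)
print(C.shape)  # (1024, 300)
```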
Sorry for my late reply.
I have got GoogleNews-vectors-negative300.bin, and I wonder how to generate the word2vec.txt for the CIFAR-10 dataset.
You can use gensim to load the model and extract word vectors. Here is a tutorial.
import gensim
# Load the pre-trained GoogleNews word2vec model (binary format)
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
# Look up the 300-dimensional vector of a single word
print(model['car'])
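To build a text file for the CIFAR-10 classes, you can look each class name up individually and write one vector per line. A sketch, with toy random vectors standing in for model[name]; the one-vector-per-line format is an assumption about what DVSQ expects:

```python
import numpy as np

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
# Stand-in vectors; with the real model, use: vecs = {c: model[c] for c in classes}
vecs = {c: np.random.randn(300) for c in classes}

with open('word2vec.txt', 'w') as f:
    for c in classes:
        # One class per line, 300 space-separated values
        f.write(' '.join('%.6f' % x for x in vecs[c]) + '\n')
```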
Thanks for your help. However, I just tried model['airplane', ...] (including the 10 classes of CIFAR-10), and the .txt I get is wrong. I hope to know how to get the correct word vectors.