fartashf/vsepp

How to build vocab?

chwlsunny opened this issue · 2 comments

Hello fartashf, your code has been very helpful to me, but I am confused by the script vocab.py. When you build the vocab for f8k_precomp, you use the train and validation captions. But when you build the vocab for f8k, you use the train, validation, and test captions. Could you explain why?

Are you referring to this function?

vsepp/vocab.py

Line 55 in 226688a

def from_flickr_json(path):

I think that's a bug. We should not build the vocab using test captions. Thanks for reporting.
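For anyone hitting the same issue, here is a minimal sketch of one way to keep test captions out of the vocab. It is not the code from vocab.py. It assumes the Flickr JSON holds a list of image entries, each with a `split` field and a `sentences` list whose items carry a `raw` caption string; the helper names `captions_from_splits` and `build_vocab`, the whitespace tokenizer, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
from collections import Counter


def captions_from_splits(images, splits=("train", "val")):
    """Collect raw captions only from images whose split is allowed.

    Assumes each image dict has a "split" key and a "sentences" list
    of dicts with a "raw" caption string (an assumed layout).
    """
    captions = []
    for image in images:
        if image.get("split") in splits:
            captions.extend(s["raw"] for s in image["sentences"])
    return captions


def build_vocab(captions, threshold=1):
    """Return the set of words seen at least `threshold` times.

    Uses lowercase whitespace tokenization to stay dependency-free.
    """
    counter = Counter()
    for caption in captions:
        counter.update(caption.lower().split())
    return {word for word, count in counter.items() if count >= threshold}


# Tiny in-memory stand-in for the "images" list of the dataset JSON.
images = [
    {"split": "train", "sentences": [{"raw": "a dog runs"}]},
    {"split": "val", "sentences": [{"raw": "a cat sleeps"}]},
    {"split": "test", "sentences": [{"raw": "a zebra jumps"}]},
]

vocab = build_vocab(captions_from_splits(images))
```

With this filter, words that occur only in test captions ("zebra" and "jumps" in the toy data) never enter the vocab, so they would be treated as unknown words at evaluation time.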

Yeah, I am referring to this function. Thanks for your considerate replies.