Some confusion
tuhinjubcse opened this issue · 2 comments
tuhinjubcse commented
python tools/make_vocab.py [entire corpus file (src + tgt cat'd)] [vocab size] > vocab.txt
python tools/make_attribute_vocab.py vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_vocab.txt
python tools/make_ngram_attribute_vocab.py vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_vocab.txt
I was wondering whether the third command should instead be:
python tools/make_ngram_attribute_vocab.py attribute_vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_ngram_vocab.txt
rpryzant commented
Thanks for reaching out! Both make_attribute_vocab.py and make_ngram_attribute_vocab.py take the original vocab.txt file as input.
The two scripts share a lot of logic and ought to be merged at some point. Pull requests welcome!
Hope that was helpful.
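To make the answer concrete, here is a minimal sketch of a helper that assembles the three command lines so that both attribute scripts receive the same vocab.txt. The function name build_vocab_commands, the corpus file names, and the numeric values are hypothetical placeholders, not part of the repository. The script paths, the argument order, and the output file names are copied from the commands quoted in the question.

```python
import shlex


def build_vocab_commands(corpus_all, corpus_src, corpus_tgt,
                         vocab_size, salience_ratio,
                         vocab_path="vocab.txt"):
    """Return (argv, output_path) pairs for the three steps.

    Hypothetical helper: it only builds command lines and runs nothing.
    """
    # Both attribute scripts take the ORIGINAL vocab file as their first
    # argument, so they share exactly the same argument list.
    shared_args = [vocab_path, corpus_src, corpus_tgt, str(salience_ratio)]
    return [
        (["python", "tools/make_vocab.py", corpus_all, str(vocab_size)],
         vocab_path),
        (["python", "tools/make_attribute_vocab.py", *shared_args],
         "attribute_vocab.txt"),
        (["python", "tools/make_ngram_attribute_vocab.py", *shared_args],
         "attribute_vocab.txt"),
    ]


# Example with placeholder file names and values (assumptions, not defaults).
commands = build_vocab_commands("all.txt", "src.txt", "tgt.txt", 10000, 15)
for argv, output_path in commands:
    # shlex.join (Python 3.8+) quotes arguments safely for a POSIX shell.
    print(f"{shlex.join(argv)} > {output_path}")
```

Note that both attribute commands, as quoted, redirect to attribute_vocab.txt, so running them one after the other overwrites the first result. If you need both outputs, redirect one of them to a different file name.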
tuhinjubcse commented
Yes, thanks, it was.