rpryzant/delete_retrieve_generate

Some confusion

tuhinjubcse opened this issue · 2 comments

python tools/make_vocab.py [entire corpus file (src + tgt cat'd)] [vocab size] > vocab.txt
python tools/make_attribute_vocab.py vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_vocab.txt
python tools/make_ngram_attribute_vocab.py vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_vocab.txt

I was wondering whether the third command should instead be:

python tools/make_ngram_attribute_vocab.py attribute_vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_ngram_vocab.txt

Thanks for reaching out! Both make_attribute_vocab.py and make_ngram_attribute_vocab.py take the original vocab.txt file as input.
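To make the data flow concrete, here is a minimal sketch of the three steps as a small Python driver. It only mirrors the command lines quoted above; the helper names (`build_commands`, `run_all`), the corpus file names passed in, and the separate output name `ngram_attribute_vocab.txt` are assumptions for illustration and are not part of the repository. The one point taken from the answer is that both attribute scripts receive `vocab.txt` as their first argument.

```python
import subprocess

# Output of make_vocab.py; per the answer above, this same file is the
# first argument to BOTH attribute-vocab scripts.
VOCAB = "vocab.txt"


def build_commands(corpus_all, corpus_src, corpus_tgt, vocab_size, salience_ratio):
    """Return (argv, output_file) pairs for the three steps.

    All file names passed in are placeholders chosen by the caller.
    """
    shared_args = [VOCAB, corpus_src, corpus_tgt, str(salience_ratio)]
    return [
        (["python", "tools/make_vocab.py", corpus_all, str(vocab_size)], VOCAB),
        (["python", "tools/make_attribute_vocab.py"] + shared_args,
         "attribute_vocab.txt"),
        # Assumption: write the n-gram vocab to its own file so it does not
        # overwrite attribute_vocab.txt. The thread does not confirm this name.
        (["python", "tools/make_ngram_attribute_vocab.py"] + shared_args,
         "ngram_attribute_vocab.txt"),
    ]


def run_all(commands):
    """Run each step in order, redirecting stdout to its output file."""
    for argv, out_path in commands:
        with open(out_path, "w") as out:
            subprocess.run(argv, stdout=out, check=True)
```

Calling `run_all(build_commands(...))` from the repository root with your own corpus paths, vocab size, and salience ratio would run the steps in order; `make_vocab.py` has to run first because the other two read its output.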

The two scripts share a lot of logic and ought to be merged at some point. Pull requests welcome!

Hope that was helpful.

Yes, thanks, it was.