This is an implementation of A Deep Generative Framework for Paraphrase Generation by Gupta et al. (AAAI 2018), with character-level token embeddings from Kim et al.'s Character-Aware Neural Language Models. The code uses Samuel Bowman's Generating Sentences from a Continuous Space implementation as a base, available here.
Before training the model, it is necessary to train word embeddings for both the questions and their paraphrases:
$ python train_word_embeddings.py --num-iterations 1200000
$ python train_word_embeddings_2.py --num-iterations 1200000
These scripts train the word embeddings described in Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality. Options:
--use-cuda: whether to run on GPU
--num-iterations: number of training iterations
--batch-size: number of training pairs per batch
--num-sample: number of tokens sampled from the noise distribution
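These embeddings are trained with skip-gram and negative sampling, where --num-sample noise tokens are drawn per positive pair. Below is a minimal PyTorch sketch of that loss; the module and its interface are illustrative assumptions, not the repo's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NEGLoss(nn.Module):
    """Skip-gram with negative sampling (Mikolov et al., 2013); illustrative sketch."""

    def __init__(self, vocab_size, embed_size, num_sample=5):
        super().__init__()
        self.num_sample = num_sample                           # noise tokens per positive pair
        self.in_embed = nn.Embedding(vocab_size, embed_size)   # center-word embeddings
        self.out_embed = nn.Embedding(vocab_size, embed_size)  # context-word embeddings

    def forward(self, center, context):
        # center, context: LongTensors of shape [batch]
        v = self.in_embed(center)               # [batch, embed]
        u = self.out_embed(context)             # [batch, embed]

        # positive pairs: push log sigmoid(u . v) up
        pos = F.logsigmoid((u * v).sum(1))

        # negative pairs: uniform noise here for brevity; word2vec uses a
        # unigram^0.75 noise distribution in practice
        noise = torch.randint(0, self.in_embed.num_embeddings,
                              (center.size(0), self.num_sample), device=center.device)
        u_neg = self.out_embed(noise)           # [batch, num_sample, embed]
        neg = F.logsigmoid(-torch.bmm(u_neg, v.unsqueeze(2)).squeeze(2)).sum(1)

        return -(pos + neg).mean()
```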
$ python train.py --num-iterations 140000
--use-cuda: whether to run on GPU
--num-iterations: number of training iterations
--batch-size: number of training pairs per batch
--learning-rate: optimizer learning rate
--dropout: probability of units in the decoder input being zeroed
--use-trained: continue from a previously trained model
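Training follows the VAE recipe from Bowman et al.: reconstruction cross-entropy plus a KL term annealed from zero, with word dropout applied to the decoder input (the --dropout flag above) so the decoder cannot rely on teacher forcing alone. Here is a hedged sketch of one training step; the model interface (returning logits, mu, logvar) and the annealing constants are assumptions, not the repo's exact code:

```python
import math
import torch
import torch.nn.functional as F

def kl_weight(step, k=0.0025, x0=2500):
    # logistic KL annealing schedule (Bowman et al.); constants are illustrative
    return 1.0 / (1.0 + math.exp(-k * (step - x0)))

def train_step(model, optimizer, batch, step, dropout=0.3):
    encoder_input, decoder_input, target = batch  # index tensors

    # word dropout: zero out decoder input tokens with probability `dropout`
    # (index 0 is assumed to be the pad/unk token)
    mask = (torch.rand(decoder_input.shape) > dropout).long()
    logits, mu, logvar = model(encoder_input, decoder_input * mask)

    # reconstruction loss over the paraphrase tokens
    rec = F.cross_entropy(logits.view(-1, logits.size(-1)), target.view(-1))
    # KL divergence between the posterior N(mu, sigma) and the prior N(0, I)
    kld = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(1)).mean()

    loss = rec + kl_weight(step) * kld
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```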
$ python test.py
--use-cuda: whether to run on GPU
--num-sample: number of paraphrases to sample
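At test time a latent vector is drawn from the standard normal prior and decoded token by token, conditioned on the source question. A minimal sampling sketch under the same assumed interface (model.decode, model.latent_size, and the special token indices are hypothetical names):

```python
import torch

def sample_paraphrase(model, source, max_len=30, go_idx=1, end_idx=2):
    # source: [1, src_len] index tensor holding the original question
    z = torch.randn(1, model.latent_size)       # draw from the N(0, I) prior

    tokens = [go_idx]
    for _ in range(max_len):
        decoder_input = torch.tensor([tokens])
        logits = model.decode(decoder_input, source, z)   # [1, len, vocab]
        probs = torch.softmax(logits[0, -1], dim=-1)
        next_token = torch.multinomial(probs, 1).item()   # ancestral sampling
        if next_token == end_idx:
            break
        tokens.append(next_token)
    return tokens[1:]  # generated paraphrase indices, without the <go> token
```

Running this several times with fresh draws of z yields diverse paraphrases of the same question, which is the point of the generative framework.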