Natural Language Generation

Text completion and next-word prediction systems have developed rapidly over the last two years. Until recently, such systems relied on probabilistic word prediction (Markov text generation). With the introduction of RNNs and transformers, the quality of generated text has improved considerably. In this project we demonstrate autocompletion based on Markov chain models and RNNs (LSTMs).
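As a minimal sketch of the Markov chain approach (the notebook contains the full implementation; the function names and the toy corpus here are illustrative, not taken from the repo):

```python
from collections import defaultdict

def build_markov_model(text, order=1):
    """Map each context of `order` consecutive words to the words that follow it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        model[context].append(words[i + order])
    return model

def predict_next(model, context):
    """Return the most frequent continuation of `context`, or None if unseen."""
    candidates = model.get(tuple(context))
    if not candidates:
        return None
    return max(set(candidates), key=candidates.count)

def complete(model, seed, length=10):
    """Greedily extend `seed` word by word using the model."""
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        nxt = predict_next(model, out[-order:])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

# Toy corpus: "cat" follows "the" more often than "mat" does.
model = build_markov_model("the cat sat on the mat and the cat ran", order=1)
print(predict_next(model, ["the"]))  # cat
```

A production variant would typically sample from the candidate distribution instead of always taking the most frequent word, which avoids repetitive output.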

The probabilistic text generator and text prediction model are implemented in this notebook. The LSTM-based model is implemented in this notebook.
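The notebooks define the full LSTM models; as a rough illustration of what a single LSTM cell computes at each time step, here is a stand-alone sketch with scalar states and made-up weights (not the repo's code):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step with scalar input and state.
    w maps each gate name to (input weight, recurrent weight, bias)."""
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])  # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])  # input gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])  # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate
    c = f * c_prev + i * g       # cell state: keep old memory, blend in new
    h = o * math.tanh(c)         # hidden state: gated view of the cell
    return h, c

# Arbitrary weights, just to trace values through the cell.
weights = {k: (0.5, 0.5, 0.0) for k in "fiog"}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:       # run a short input sequence through the cell
    h, c = lstm_step(x, h, c, weights)
```

In the real models these scalars become vectors and matrices, and a softmax layer over the hidden state produces next-word probabilities.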

To run these notebooks, the following libraries must be installed:

Two of the datasets used for training the models, the Gutenberg corpus and the Brown corpus, are available through the nltk library. The third, a movie plot corpus, is stored as a pickle file in the repo.
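The nltk corpora can be fetched with `nltk.download("gutenberg")` and `nltk.download("brown")` and then accessed via `nltk.corpus`. The pickled movie plot corpus is read back with the standard `pickle` module; the filename and the list-of-strings layout below are assumptions for illustration, not the repo's actual file:

```python
import pickle
from pathlib import Path

# Hypothetical path and structure; the actual pickle in the repo may differ.
PLOTS_PATH = Path("movie_plots.pkl")

# For a self-contained demo, write a tiny stand-in corpus first...
sample_plots = ["A detective hunts a jewel thief.", "Two robots fall in love."]
PLOTS_PATH.write_bytes(pickle.dumps(sample_plots))

# ...then load it back the same way the repo's corpus would be loaded.
plots = pickle.loads(PLOTS_PATH.read_bytes())
print(len(plots), "plots loaded")
```

Only unpickle files from sources you trust, since `pickle.loads` can execute arbitrary code embedded in the file.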