The goal of this project is to generate text according to what our system has learned during training, by analyzing the text of certain datasets. The main idea is therefore to predict the next characters given an input text. An example is presented below:
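As a purely illustrative example: given the input text "the neural networ", a well-trained model should assign a high probability to "k" as the next character.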
The architecture we built is described by the figure below:
- The input consists of sequences of 40 one-hot encoded characters; there are 59 possible characters (see the encoding sketch after this list).
- An RNN (Recurrent Neural Network) layer takes into account the temporal information in the data.
- A softmax layer, which gives each possible character its probability of being the next one.
- The output is the character with the largest predicted probability.
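As a minimal sketch of how such input tensors can be built (the variable names `text`, `chars`, `char_indices`, and `maxlen` are our own illustrative choices, reused in the Keras snippets below):

```python
import numpy as np

# text: the training corpus as a single string (assumed loaded beforehand)
maxlen = 40                          # length of each input sequence
chars = sorted(set(text))            # the 59 distinct characters in the corpus
char_indices = {c: i for i, c in enumerate(chars)}

def one_hot(sequence):
    """Encode one character sequence as a (maxlen, len(chars)) one-hot matrix."""
    x = np.zeros((maxlen, len(chars)), dtype=np.float32)
    for t, char in enumerate(sequence):
        x[t, char_indices[char]] = 1.0
    return x
```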
Different models were tried for this task; their differences lie only in which RNN layer is used (a sketch of two of the variants follows the list):
- One-layer LSTM (Long Short-Term Memory) with 128 hidden units.
- One-layer GRU (Gated Recurrent Unit) with 128 hidden units.
- One-layer PLSTM (Phased LSTM).
- Two-layer LSTM with 256 and 128 hidden units respectively.
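As a rough sketch of how the GRU and two-layer LSTM variants might look in Keras (layer sizes taken from the list above; the Phased LSTM is omitted here since it is not a core Keras layer):

```python
from keras.models import Sequential
from keras.layers import LSTM, GRU, Dense, Activation

# GRU variant: the same architecture with a GRU in place of the LSTM
gru_model = Sequential()
gru_model.add(GRU(128, input_shape=(maxlen, len(chars))))
gru_model.add(Dense(len(chars)))
gru_model.add(Activation('softmax'))

# Two-layer LSTM variant: the first layer must return the full sequence
stacked_model = Sequential()
stacked_model.add(LSTM(256, return_sequences=True, input_shape=(maxlen, len(chars))))
stacked_model.add(LSTM(128))
stacked_model.add(Dense(len(chars)))
stacked_model.add(Activation('softmax'))
```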
Here we present the implementation of the one-layer LSTM model in Keras (`maxlen` and `chars` are defined in the encoding sketch above):
```python
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
# LSTM with 128 hidden units over sequences of 40 one-hot encoded characters
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))  # one probability per possible character
```
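A minimal sketch of how the next character is then chosen from the softmax output (`one_hot`, `char_indices`, and `maxlen` come from the encoding sketch above; `seed_text` and `indices_char` are illustrative names):

```python
import numpy as np

indices_char = {i: c for c, i in char_indices.items()}  # index -> character

x = one_hot(seed_text[-maxlen:])                 # encode the last 40 characters
preds = model.predict(x[np.newaxis])[0]          # one probability per character
next_char = indices_char[int(np.argmax(preds))]  # pick the most probable one
```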
The datasets used for this purpose are:
- Deep Learning for Speech and Language
- Understanding LSTM
- The Unreasonable Effectiveness of Recurrent Neural Networks
Slides for our project can be found here.
The webpage for the project is here.