These are bare-bones implementations of a vanilla RNN in numpy and an LSTM in tensorflow, written to highlight the process of backpropagation through time (BPTT) [1]. Both are demonstrated on character-based and word-based text generation. The RNN's BPTT uses manually derived derivatives, while the LSTM relies on the auto-differentiation capability of tensorflow. The RNN's BPTT can be truncated. For the LSTM only full BPTT is implemented, owing to a limitation of tensorflow's auto-differentiation in loop control.
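To make the idea of truncated BPTT concrete, here is a minimal numpy sketch. It is not the notebook's code: the function names `forward` and `backward`, the parameter names (`Wxh`, `Whh`, `Why`, `bh`, `by`), the one-hot inputs, and the softmax cross-entropy loss are all assumptions chosen for illustration. The error at each time step is propagated back through at most `truncate` earlier steps, so setting `truncate` to the sequence length gives full BPTT.

```python
import numpy as np

def forward(xs, targets, h0, Wxh, Whh, Why, bh, by):
    """Run a vanilla RNN over a sequence of token indices (hypothetical helper)."""
    hs, ps, loss = {-1: h0}, {}, 0.0
    for t, ix in enumerate(xs):
        x = np.zeros((Wxh.shape[1], 1))
        x[ix] = 1.0                                   # one-hot input
        hs[t] = np.tanh(Wxh @ x + Whh @ hs[t - 1] + bh)
        y = Why @ hs[t] + by
        e = np.exp(y - y.max())                       # numerically stable softmax
        ps[t] = e / e.sum()
        loss += -np.log(ps[t][targets[t], 0])         # cross-entropy
    return hs, ps, loss

def backward(xs, targets, hs, ps, Wxh, Whh, Why, truncate):
    """BPTT in which each step's error travels back at most `truncate` steps."""
    dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
    dbh = np.zeros((Whh.shape[0], 1))
    dby = np.zeros((Why.shape[0], 1))
    for t in reversed(range(len(xs))):
        dy = ps[t].copy()
        dy[targets[t]] -= 1.0                         # d(loss_t)/d(logits)
        dWhy += dy @ hs[t].T
        dby += dy
        dh = Why.T @ dy                               # error entering h_t
        # Walk back from step t, stopping after `truncate` steps.
        for k in range(t, max(-1, t - truncate), -1):
            draw = (1.0 - hs[k] ** 2) * dh            # through tanh
            dbh += draw
            dWxh[:, xs[k]] += draw[:, 0]              # one-hot input picks a column
            dWhh += draw @ hs[k - 1].T
            dh = Whh.T @ draw                         # error entering h_{k-1}
    return dWxh, dWhh, dWhy, dbh, dby

# Toy usage: vocabulary of 4 tokens, 3 hidden units, error truncated to 2 steps.
rng = np.random.default_rng(0)
V, H = 4, 3
Wxh, Whh, Why = rng.normal(0, 0.5, (H, V)), rng.normal(0, 0.5, (H, H)), rng.normal(0, 0.5, (V, H))
bh, by = np.zeros((H, 1)), np.zeros((V, 1))
xs, targets = [0, 1, 2, 3, 1], [1, 2, 3, 1, 0]
hs, ps, loss = forward(xs, targets, np.zeros((H, 1)), Wxh, Whh, Why, bh, by)
grads = backward(xs, targets, hs, ps, Wxh, Whh, Why, truncate=2)
```

The nested loop keeps the truncation rule explicit at the cost of extra work. With a small `truncate` the recurrent gradients ignore long-range dependencies, which is the trade-off truncation makes for cheaper updates.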
Example of character-based text generated by the LSTM:
he will best accusal hears: art not? nor doth distriam: he radre sweet true.
leave on me; you shall I think frown.
The network does learn basic spelling and learns to end a sentence with a period. The sentences do not make much sense, though. Much more training and experimentation is needed to produce better results. Lesson learned: implementing and training recurrent networks is much trickier than implementing and training feed-forward networks.
Resources for conceptual explanations:
- http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
- http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Notebooks:
- "Basic RNN.ipynb" => walk-through implementation of a vanilla RNN in numpy, with user-defined truncated BPTT
- "Basic LSTM.ipynb" => walk-through implementation of an LSTM in tensorflow, with only full BPTT implemented due to a limitation of tensorflow