
char-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM) for character-level language models in Python using TensorFlow.

Forked from sherjilozair's char-rnn-tensorflow, with the following changes:

  • Added separate preprocessing code
  • Added validation and test splitting
  • Used the RNN state as a tuple in TensorFlow
  • Implemented sampling with temperature
  • Added dropout as in Zaremba et al.
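Sampling with temperature, mentioned above, rescales the model's output distribution before drawing each character: temperatures below 1 sharpen the distribution toward the most likely characters, while temperatures above 1 flatten it and produce more diverse (but riskier) text. A minimal sketch of the idea, independent of this repository's actual sample.py (the function name here is illustrative):

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Sample an index from unnormalized logits after temperature scaling.

    temperature < 1 sharpens the distribution (more conservative samples);
    temperature > 1 flattens it (more diverse samples).
    """
    # Scale logits by 1/temperature.
    scaled = [l / temperature for l in logits]
    # Subtract the max before exponentiating, for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the resulting categorical distribution.
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

At a very low temperature this behaves almost like argmax decoding; at a high temperature it approaches uniform sampling over the vocabulary.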

Inspired by Andrej Karpathy's char-rnn.

Requirements

  • Python
  • TensorFlow

Basic Usage

To preprocess a UTF-8 text file, run python preprocess.py --input_file INPUT_FILE --data_dir DATA_DIR.
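Character-level preprocessing of this kind typically builds a vocabulary mapping each distinct character to an integer id and encodes the text as a sequence of those ids. The exact details of preprocess.py may differ; this is only a generic sketch (the function names are hypothetical):

```python
from collections import Counter

def build_vocab(text):
    """Map each distinct character to an integer id, most frequent first."""
    counts = Counter(text)
    chars = [c for c, _ in counts.most_common()]
    return {c: i for i, c in enumerate(chars)}

def encode(text, vocab):
    """Encode a string as a list of integer character ids."""
    return [vocab[c] for c in text]

text = "hello world"
vocab = build_vocab(text)
ids = encode(text, vocab)
```

Decoding is the inverse lookup, so the original text is exactly recoverable from the id sequence and the vocabulary.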

To train with default parameters, run python trainlm.py --data_dir DATA_DIR --save_dir SAVE_DIR.

To continue training, run python trainlm.py --init_from SAVE_DIR.

To sample from a checkpointed model, run python sample.py --init_from SAVE_DIR.
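Putting the steps above together, an end-to-end session might look like this (the data and save paths are placeholders):

```shell
# Preprocess a UTF-8 corpus into the data directory.
python preprocess.py --input_file data/input.txt --data_dir data/mycorpus

# Train with default parameters, writing checkpoints to the save directory.
python trainlm.py --data_dir data/mycorpus --save_dir save/mycorpus

# Generate text from the latest checkpoint.
python sample.py --init_from save/mycorpus
```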

Roadmap

  • Benchmark performance