# Recurrent Neural Network Language Model using TensorFlow

## Content
- Original Work
- Motivations
- Quickstart
- Continue Training
- Getting a text's perplexity under the LM
- Line-by-line loglikes
- Text generation
- Results on PTB dataset
- Contributing
## Original Work
Our work is based on the RNN LM tutorial on tensorflow.org, following the paper by Zaremba et al., 2014.
The tutorial uses the PTB dataset (tgz). We intend to work with various datasets, which is why we made the names more generic by removing several "PTB" prefixes initially present in the code.
Original sources: TensorFlow v0.11 - PTB
See also:
- Nelken's "tf" repo, which inspired our work by implementing features we are interested in.
- Benoit Favre's tf rnn lm
## Motivations
- Getting started with TensorFlow
- Make RNN LM manipulation easy in practice (easily inspect/edit configs, cancel/resume training, multiple outputs...)
- Train RNN LMs for ASR with Kaldi (especially using the `loglikes` mode)
## Quickstart
```sh
git clone https://github.com/pltrdy/tf_rnnlm
cd tf_rnnlm
./train.py --help
```
Downloading the PTB dataset:
```sh
chmod +x tools/get_ptb.sh
./tools/get_ptb.sh
```
Training a small model:
```sh
mkdir small_model
./train.py --data_path ./simple-examples/data --model_dir=./small_model --config small
```
Training a custom model:
```sh
mkdir custom_model

# Generate a new config file
chmod +x gen_config.py
./gen_config.py small custom_model/config

# Edit whatever you want
vi custom_model/config

# Train it (it will automatically look for a 'config' file in the model
# directory since no --config is set). It will look for 'train.txt',
# 'test.txt' and 'valid.txt' in --data_path; these files must be present.
./train.py --data_path=./simple-examples/data --model_dir=./custom_model
```
Note: data files are expected to be named `train.txt`, `test.txt` and `valid.txt`; `get_ptb.sh` creates symlinks for that purpose.
Training all models and reporting results:
```sh
./run.sh
```
Yes, that's all. It will train the `small`, `medium` and `large` models, then generate a report like this one using `report.sh`.
Feel free to share the report with us!
## Continue Training
One can resume an interrupted training run with the following command:
```sh
./train.py --data_path=./simple-examples/data --model_dir=./model
```
where `./model` must contain `config`, `word_to_id`, `checkpoint` and the corresponding `.ckpt` files.
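Under the hood, resuming boils down to restoring the latest checkpoint found in the model directory. A minimal sketch of that pattern with the v0.11-era API (illustrative, not the actual `train.py` code):

```python
import tensorflow as tf

# Stand-in for the model's real variables.
w = tf.Variable(tf.zeros([10]), name="w")
saver = tf.train.Saver()

with tf.Session() as sess:
    # 'checkpoint' is the index file Saver maintains in the model directory;
    # latest_checkpoint() reads it to find the newest .ckpt file.
    ckpt = tf.train.latest_checkpoint("./model")
    if ckpt:
        saver.restore(sess, ckpt)                 # resume training
    else:
        sess.run(tf.initialize_all_variables())   # start from scratch
```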
## Getting a text's perplexity under the LM
```sh
# Compute and output the perplexity of ./simple-examples/data/test.txt
# using the LM in ./model
./test.py --data_path=./simple-examples/data --model_dir=./model
```
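The reported perplexity is the usual exponential of the average per-word cross-entropy. A minimal sketch of the relation (illustrative names, not the actual `test.py` code):

```python
import math

def perplexity(total_cost_nats, num_words):
    """Perplexity = exp of the average per-word cross-entropy (in nats)."""
    return math.exp(total_cost_nats / num_words)

# A model assigning each test word an average probability of 1/113.366
# yields a test perplexity of 113.366.
```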
## Line-by-line loglikes

Runs the model on each `stdin` line and outputs its 'loglikes' (i.e. `-cost/log(10)`).
Note: in particular, it is meant to be used for Kaldi's rescoring.
```sh
cat ./data/test.txt | ./loglikes.py --model_dir=./model
```
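The score itself is just a change of base: the model's cross-entropy cost is in nats, while Kaldi expects base-10 log-likelihoods. A minimal sketch (illustrative, not the actual `loglikes.py` code):

```python
import math

def loglikes(cost_nats):
    """Convert a sentence's total cross-entropy cost (in nats) to a
    base-10 log-likelihood, i.e. -cost / log(10)."""
    return -cost_nats / math.log(10)
```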
## Text generation
Not documented yet.
## Results on PTB dataset
Configurations `small`, `medium` and `large` are defined in `config.py` and are the same as in `tensorflow.models.rnn.ptb.ptb_word_lm.py:200`.
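For reference, here is the `small` configuration as it appears in the TensorFlow PTB tutorial (values quoted from the tutorial source; the runs below override `batch_size`):

```python
class SmallConfig(object):
    """Small config, as in the TensorFlow PTB tutorial."""
    init_scale = 0.1    # uniform range for weight initialization
    learning_rate = 1.0
    max_grad_norm = 5   # gradient clipping threshold
    num_layers = 2      # stacked LSTM layers
    num_steps = 20      # BPTT unrolling length
    hidden_size = 200
    max_epoch = 4       # epochs at full learning rate
    max_max_epoch = 13  # total training epochs
    keep_prob = 1.0     # dropout keep probability
    lr_decay = 0.5
    batch_size = 20
    vocab_size = 10000
```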
### batch_size=32

Config | train ppl | valid ppl | test ppl | speed | training time |
---|---|---|---|---|---|
small | 24.608 | 118.848 | 113.366 | ~49kWPS | 4m17s |
medium | 26.068 | 91.305 | 87.152 | ~25kWPS | 24m50s |
large | 18.245 | 84.603 | 79.515 | ~6kWPS | 135m15s |
small | 27.913 | 123.896 | 119.496 | ~42kWPS | 4m56s |
medium | 28.533 | 98.105 | 94.576 | ~23kWPS | 26m51s |
large | 21.635 | 91.916 | 87.110 | ~6kWPS | 140m675 |
### batch_size=64

Config | train ppl | valid ppl | test ppl | speed | training time |
---|---|---|---|---|---|
small | 32.202 | 119.802 | 115.209 | ~44kWPS | 4m40s |
medium | 31.591 | 97.219 | 93.450 | ~24kWPS | 25m0s |
large | 18.198 | 88.675 | 83.143 | ~9kWPS | 95m25s |
small | 39.031 | 127.949 | 124.292 | ~94kWPS | 3m9s |
medium | 33.130 | 102.652 | 99.381 | ~29kWPS | 21m7s |
large | 21.122 | 95.310 | 90.658 | ~7kWPS | 112m48s |
kWPS: processing speed, i.e. thousands of words per second.
Reported times are *real* times (see "What do 'real', 'user' and 'sys' mean in the output of time(1)?").
Testing is done using softmax on transposed weights (see docs/transpose.md).
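A minimal sketch of what the transposed softmax means, assuming (as is common for sampled-softmax training) that the weights are stored as `[vocab_size, hidden_size]`, the layout `tf.nn.sampled_softmax_loss` expects; names and shapes here are illustrative:

```python
import tensorflow as tf

hidden_size, vocab_size = 200, 10000
# RNN outputs, flattened to [batch * num_steps, hidden_size].
outputs = tf.placeholder(tf.float32, [None, hidden_size])
# Weights stored as [vocab_size, hidden_size] for training.
softmax_w = tf.get_variable("softmax_w", [vocab_size, hidden_size])
softmax_b = tf.get_variable("softmax_b", [vocab_size])
# At test time the full softmax needs [hidden_size, vocab_size],
# hence the transpose.
logits = tf.matmul(outputs, tf.transpose(softmax_w)) + softmax_b
```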
Increasing `batch_size` should speed up training, at the cost of a small perplexity increase and higher GPU memory consumption (which can trigger an out-of-memory exception).
## Contributing
Please do!
Fork the repo -> edit the code -> commit with a descriptive commit message -> open a pull request.
You can also open an issue for any discussion about bugs, performance or results.
Please also share your results with us! (see sharing your results)