/py_crepe

Keras implementation of the Crepe character-level convolutional neural net.

Primary LanguagePython

#py_crepe

version 0.1.0

========

This is an re-implementation of the Crepe character-level convolutional neural net model described in this paper. The re-implementation is done using the Python library Keras.

Details


The model implemented here follows the one described in the paper rather closely. This means that the vocabulary is the same, fixed vocabulary described in the paper, as well as using stochastic gradient descent as the optimizer. Running the model for 9 epochs returns results in line with those described in the paper.

It's relatively easy to modify the code to use a different vocabulary, e.g., one learned from the training data, and to use different optimizers such as adam.

Running

TensorFlow backend

simply run python main.py

Theano backend


The usual command to run the model is:

THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=1.0 python main.py

You can also set these parameters in a ~/.theanorc file so you don't have to run them each time. The most important parameters, other than running it on a GPU, is the CNMeM allocation variable, which allows for much faster runtimes. When combined with cuDNN there's a rather drastic speedup when compared to the original Lua/Torch implementation.