Recurrent Neural Networks - A Short TensorFlow Tutorial
Setup
Clone this repo to your local machine, and add the RNN-Tutorial directory as an environment variable in your ~/.profile. Instructions are given for the bash shell:
git clone https://github.com/silicon-valley-data-science/RNN-Tutorial
cd RNN-Tutorial
echo "export RNN_TUTORIAL=${PWD}" >> ~/.profile
echo "export PYTHONPATH=${PWD}/src:${PYTHONPATH}" >> ~/.profile
source ~/.profile
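To confirm that the two exports took effect in a new shell, a small check along the following lines may help. This is only a sketch using the Python standard library; the helper name `check_tutorial_env` is made up for this example and is not part of the repo.

```python
import os


def check_tutorial_env(environ=None):
    """Return a list of problems; an empty list means the setup looks fine."""
    environ = os.environ if environ is None else environ
    problems = []

    root = environ.get("RNN_TUTORIAL")
    if not root:
        problems.append("RNN_TUTORIAL is not set")
        return problems
    if not os.path.isdir(root):
        problems.append("RNN_TUTORIAL does not point to a directory: " + root)

    # The setup commands above prepend $RNN_TUTORIAL/src to PYTHONPATH.
    src_dir = os.path.normpath(os.path.join(root, "src"))
    entries = [
        os.path.normpath(p)
        for p in environ.get("PYTHONPATH", "").split(os.pathsep)
        if p
    ]
    if src_dir not in entries:
        problems.append("PYTHONPATH does not contain " + src_dir)
    return problems


if __name__ == "__main__":
    for problem in check_tutorial_env():
        print("WARNING:", problem)
```

If the script prints nothing, both variables are set as the commands above intend.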
Create a Conda environment (you will need to install Conda first):
conda create --name tf-rnn python=3
source activate tf-rnn
cd $RNN_TUTORIAL
pip install -r requirements.txt
Install TensorFlow
If you have an NVIDIA GPU with CUDA already installed:
pip install tensorflow-gpu==1.0.1
If you will be running TensorFlow on CPU only (e.g. a MacBook Pro), use the following command (if you get an error the first time you run it, read the error note below):
pip install --upgrade \
  https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.0.1-py3-none-any.whl
Error note (if you did not get an error, skip this paragraph): Depending on how you installed pip and/or conda, we've seen different outcomes. If you get an error the first time, rerunning the command may misleadingly report that it installed without error. In that case, try running pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.0.1-py3-none-any.whl --ignore-installed. The --ignore-installed flag tells pip to reinstall the package. If that still doesn't work, please open an issue, or you can try to follow the advice here.
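Because a rerun can report success even when the install is broken, it may be worth checking from Python that TensorFlow actually imports. The snippet below is a minimal sketch; `installed_version` is a hypothetical helper written for this example, and it relies only on the `__version__` attribute that TensorFlow exposes.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not define __version__; report that instead of failing.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_version("tensorflow")
    if version is None:
        print("TensorFlow is not importable in this environment")
    else:
        print("TensorFlow version:", version)
```

Run it inside the activated tf-rnn environment; this tutorial's install commands pin version 1.0.1.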
Run unittests
We have included example unit tests for the tf_train_ctc.py script:
python $RNN_TUTORIAL/src/tests/train_framework/tf_train_ctc_test.py
Run RNN training
All configurations for the RNN training script can be found in $RNN_TUTORIAL/configs/neural_network.ini
python $RNN_TUTORIAL/src/train_framework/tf_train_ctc.py
NOTE: If you have a GPU available, the code will run faster if you set tf_device = /gpu:0 in configs/neural_network.ini.
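To see which device a config file currently selects, you can read it with Python's standard `configparser`. The sketch below searches every section for a key instead of assuming a section name. The section name `hypothetical_section` and the inline sample are placeholders for illustration, not the contents of the real configs/neural_network.ini.

```python
import configparser


def find_option(parser, option):
    """Return (section, value) pairs for every section in which `option` is set."""
    matches = []
    for section in parser.sections():
        if parser.has_option(section, option):
            matches.append((section, parser.get(section, option)))
    return matches


# Placeholder config text; the real file's sections may be named differently.
SAMPLE = """
[hypothetical_section]
tf_device = /gpu:0
"""

# interpolation=None keeps any '%' characters in values literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)
# For the real file, parser.read(path_to_ini) could be used instead of read_string.
print(find_option(parser, "tf_device"))  # [('hypothetical_section', '/gpu:0')]
```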
TensorBoard configuration
To visualize your results in TensorBoard:
tensorboard --logdir=$RNN_TUTORIAL/models/nn/debug_models/summary/
- TensorBoard can be found in your browser at http://localhost:6006.
- tf.name_scope is used to define parts of the network for visualization in TensorBoard. TensorBoard automatically finds any similarly structured network parts, such as identical fully connected layers, and groups them in the graph visualization.
- Related to this are the tf.summary.* methods that log values of network parts, such as distributions of layer activations or error rate across epochs. These summaries are grouped within the tf.name_scope.
- See the official TensorFlow documentation for more details.
Add data
We have included example data from the LibriVox corpus in data/raw/librivox/LibriSpeech/. The data is separated into folders:
- Train: train-clean-100-wav (5 examples)
- Test: test-clean-wav (2 examples)
- Dev: dev-clean-wav (2 examples)
If you would like to train a performant model, you can add additional wav and txt files to these folders, or create a new folder and update configs/neural_network.ini with the folder locations.
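When adding your own files, a quick check that every audio file has a transcript can save a failed training run. The sketch below assumes each .wav file has a .txt file with the same base name in the same folder. That pairing rule is an assumption made for this example, so confirm it against the included example data before relying on it. `unpaired_wavs` is a hypothetical helper, not part of the repo.

```python
import os
import tempfile


def unpaired_wavs(folder):
    """Return sorted paths of .wav files under `folder` with no same-named .txt beside them."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(folder):
        names = set(filenames)
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.lower() == ".wav" and stem + ".txt" not in names:
                missing.append(os.path.join(dirpath, name))
    return sorted(missing)


# Demo on a throwaway directory so the example runs without the repo data.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.wav", "a.txt", "b.wav"):
        open(os.path.join(demo_dir, name), "w").close()
    print([os.path.basename(p) for p in unpaired_wavs(demo_dir)])  # ['b.wav']
```

To check the tutorial data, pass one of the folders listed above (for example the train-clean-100-wav folder) to `unpaired_wavs` instead of the temporary directory.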
Remove additions
We made a few additions to your ~/.profile. Remove them if you want. If you would rather keep the environment variables, make sure your ~/.bash_profile loads them by running:
echo "source ~/.profile" >> ~/.bash_profile
Next steps
We hope that our provided repo is a useful resource for getting started. Please share your experiences with adopting RNNs by contacting us or putting in pull requests for suggested changes. To stay in touch, sign up for our newsletter.