How to use TensorLayer

While research in Deep Learning continues to improve the world, we use a bunch of tricks to implement algorithms with TensorLayer day to day.

Here is a summary of the tricks for using TensorLayer; you can also find more tricks in the FAQ.

If you find a trick that is particularly useful in practice, please open a Pull Request to add it to this document. If it is reasonable and verified, we will merge it in.

1. Installation

  • To pin your TL version and edit the source code easily, you can download the whole repository by executing git clone https://github.com/zsdonghao/tensorlayer.git in your terminal, then copy the tensorlayer folder into your project
  • As TL is growing very fast, if you want to use pip install, we suggest you install the master version (e.g. pip install git+https://github.com/zsdonghao/tensorlayer.git)
  • For NLP applications, you will need to install NLTK and the NLTK data (e.g. run import nltk; nltk.download() once)

2. Interaction between TF and TL
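
  • TF to TL: any TF tensor can be fed into a TL network by wrapping it with InputLayer.
  • TL to TF: the TF tensor of a TL network is available as net.outputs.

A minimal sketch of both directions (the placeholder shape and layer names are illustrative):

import tensorflow as tf
import tensorlayer as tl

x = tf.placeholder(tf.float32, shape=[None, 100], name='x')
t = tf.nn.relu(x)                                  # an ordinary TF tensor
net = tl.layers.InputLayer(t, name='in')           # TF -> TL
net = tl.layers.DenseLayer(net, n_units=10, name='dense')
y = tf.nn.softmax(net.outputs)                     # TL -> TF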

3. Training/Testing switching

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import *

def mlp(x, is_train=True, reuse=False):
    # build the same network twice (train/test), sharing weights via scope reuse
    with tf.variable_scope("MLP", reuse=reuse):
        tl.layers.set_name_reuse(reuse)
        net = InputLayer(x, name='in')
        net = DropoutLayer(net, keep=0.8, is_fix=True, is_train=is_train, name='drop1')
        net = DenseLayer(net, n_units=800, act=tf.nn.relu, name='dense1')
        net = DropoutLayer(net, keep=0.8, is_fix=True, is_train=is_train, name='drop2')
        net = DenseLayer(net, n_units=800, act=tf.nn.relu, name='dense2')
        net = DropoutLayer(net, keep=0.8, is_fix=True, is_train=is_train, name='drop3')
        net = DenseLayer(net, n_units=10, act=tf.identity, name='out')
        logits = net.outputs                       # pre-activation outputs for the cost
        net.outputs = tf.nn.sigmoid(net.outputs)   # activated outputs for inference
        return net, logits
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
net_train, logits = mlp(x, is_train=True, reuse=False)  # graph for training
net_test, _ = mlp(x, is_train=False, reuse=True)        # weight-sharing graph for testing
cost = tl.cost.cross_entropy(logits, y_, name='cost')   # sparse softmax cross-entropy
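
To evaluate, compare the test network's outputs with the labels. The accuracy op below is ordinary TensorFlow, not part of the original example:

correct_prediction = tf.equal(tf.argmax(net_test.outputs, 1), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))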

4. Get variables and outputs

# get the trainable variables whose names contain "MLP", then train only those
train_vars = tl.layers.get_variables_with_name('MLP', train_only=True, printable=True)
train_op = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(cost, var_list=train_vars)
# get the output tensors of all layers whose names contain "MLP"
layers = tl.layers.get_layers_with_name(net_train, "MLP", printable=True)
  • This method is usually used for activation regularization; a minimal sketch follows.
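
For example, an L2 penalty on all activations in the "MLP" scope can be added to the cost (the penalty weight 0.001 here is illustrative, not from the original):

act_reg = 0.0
for a in layers:
    act_reg += 0.001 * tf.reduce_mean(tf.square(a))  # penalize large activations
total_cost = cost + act_reg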

5. Pre-trained CNN and Resnet
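
The repository ships pre-trained examples such as tutorial_vgg16.py, tutorial_vgg19.py and tutorial_inceptionV3_tfslim.py. A minimal sketch of restoring pre-trained parameters from a .npz file (the file name and the network net are illustrative):

sess = tf.InteractiveSession()
tl.layers.initialize_global_variables(sess)
params = tl.files.load_npz(name='vgg16_weights.npz')  # load a list of arrays
tl.files.assign_params(sess, params, net)             # assign them to the network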

6. Data augmentation
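
TL's augmentation functions live in tl.prepro, and tl.prepro.threading_data applies a function to a batch of data in parallel threads. A minimal sketch (the distortion function and its parameters are illustrative):

def distort(img):
    # randomly flip, rotate and crop a single image
    img = tl.prepro.flip_axis(img, axis=1, is_random=True)
    img = tl.prepro.rotation(img, rg=20, is_random=True)
    img = tl.prepro.crop(img, wrg=24, hrg=24, is_random=True)
    return img

X_batch = tl.prepro.threading_data(X_batch, fn=distort)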

7. Batch of data
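
tl.iterate.minibatches is the usual way to loop over a dataset in batches. A minimal sketch (X_train and y_train are illustrative numpy arrays; x, y_ and train_op are as in the MLP example above):

for X_batch, y_batch in tl.iterate.minibatches(X_train, y_train, batch_size=128, shuffle=True):
    sess.run(train_op, feed_dict={x: X_batch, y_: y_batch})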

8. Customized layer

    1. Write a TL layer directly.
    2. Use LambdaLayer; it can also accept functions that create new variables. With this layer you can connect any third-party TF library or your own customized function to TL. Here is an example of using Keras and TL together:
import tensorflow as tf
import tensorlayer as tl
from keras.layers import *
from tensorlayer.layers import *

def my_fn(x):
    # an ordinary Keras model definition
    x = Dropout(0.8)(x)
    x = Dense(800, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(800, activation='relu')(x)
    x = Dropout(0.5)(x)
    logits = Dense(10, activation='linear')(x)
    return logits

x = tf.placeholder(tf.float32, shape=[None, 784], name='x')  # input placeholder (shape illustrative)
network = InputLayer(x, name='input')
network = LambdaLayer(network, my_fn, name='keras')  # wrap the Keras graph as a TL layer
...

9. Sentences tokenization

>>> captions = ["one two , three", "four five five"]  # 2 sentences
>>> processed_capts = []
>>> for c in captions:
>>>    c = tl.nlp.process_sentence(c, start_word="<S>", end_word="</S>")
>>>    processed_capts.append(c)
>>> print(processed_capts)
... [['<S>', 'one', 'two', ',', 'three', '</S>'],
... ['<S>', 'four', 'five', 'five', '</S>']]
>>> tl.nlp.create_vocab(processed_capts, word_counts_output_file='vocab.txt', min_word_count=1)
... [TL] Creating vocabulary.
... Total words: 8
... Words in vocabulary: 8
... Wrote vocabulary file: vocab.txt
  • Finally, use tl.nlp.Vocabulary to create a vocabulary object from the vocabulary text file created by tl.nlp.create_vocab:
>>> vocab = tl.nlp.Vocabulary('vocab.txt', start_word="<S>", end_word="</S>", unk_word="<UNK>")
... INFO:tensorflow:Initializing vocabulary from file: vocab.txt
... [TL] Vocabulary from vocab.txt : <S> </S> <UNK>
... vocabulary with 10 words (includes start_word, end_word, unk_word)
...   start_id: 2
...   end_id: 3
...   unk_id: 9
...   pad_id: 0

Then you can map words to IDs, or vice versa, as follows:

>>> vocab.id_to_word(2)
... 'one'
>>> vocab.word_to_id('one')
... 2
>>> vocab.id_to_word(100)
... '<UNK>'
>>> vocab.word_to_id('hahahaha')
... 9

10. Dynamic RNN and sequence length

  • Apply zero padding to a batch of tokenized sentences as follows:
>>> sequences = [[1,1,1,1,1],[2,2,2],[3,3]]
>>> sequences = tl.prepro.pad_sequences(sequences, maxlen=None, 
...         dtype='int32', padding='post', truncating='pre', value=0.)
... [[1 1 1 1 1]
...  [2 2 2 0 0]
...  [3 3 0 0 0]]
  • Then use tl.layers.retrieve_seq_length_op2 to compute the real length of each padded sequence (it counts the non-zero entries):
>>> data = [[1,2,0,0,0], [1,2,3,0,0], [1,2,6,1,0]]
>>> o = tl.layers.retrieve_seq_length_op2(data)
>>> sess = tf.InteractiveSession()
>>> tl.layers.initialize_global_variables(sess)
>>> print(o.eval())
... [2 3 4]
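
These lengths can then be fed into a dynamic RNN so that padded steps are ignored. A minimal sketch (the vocabulary size, embedding size, cell and layer names are illustrative):

x = tf.placeholder(tf.int64, [None, None], name='x')  # batch of zero-padded token IDs
net = tl.layers.EmbeddingInputlayer(x, vocabulary_size=100, embedding_size=50, name='embed')
net = tl.layers.DynamicRNNLayer(net,
        cell_fn=tf.contrib.rnn.BasicLSTMCell,
        n_hidden=64,
        sequence_length=tl.layers.retrieve_seq_length_op2(x),  # skip padded steps
        return_last=True,
        name='dynamic_rnn')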

11. Common problems

  • A Matplotlib issue can arise when importing TensorLayer; see the GitHub issues and the FAQ.

12. Compatibility with other TF wrappers

TL can interact with other TF wrappers, which means that if you find code or a model implemented with another wrapper, you can just use it!

  • Keras to TL: KerasLayer (if you find code implemented in Keras, just use it; example here)
  • TF-Slim to TL: SlimNetsLayer (you can use all of Google's pre-trained convolutional models with this layer!); a minimal sketch follows below
  • We expect more libraries to become compatible with TL
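
A minimal sketch of wrapping a TF-Slim network (assuming slim_vgg16 is a slim network function you have imported; the argument values are illustrative):

net = tl.layers.InputLayer(x, name='input')
net = tl.layers.SlimNetsLayer(layer=net, slim_layer=slim_vgg16,
                              slim_args={'num_classes': 1000, 'is_training': False},
                              name='vgg16')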

13. Compatibility with different TF versions

Useful links
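
  • TensorLayer repository: https://github.com/zsdonghao/tensorlayer
  • Documentation: http://tensorlayer.readthedocs.io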

Author

  • Zhang Rui
  • You