/tensorlayer

TensorLayer: Deep learning and Reinforcement learning library for TensorFlow.

Primary LanguagePythonOtherNOASSERTION

TensorLayer: Deep learning and Reinforcement learning library for TensorFlow.

TensorLayer is a transparent Deep Learning and Reinforcement Learning library built on the top of Google TensorFlow. It was designed to provide a higher-level API to TensorFlow in order to speed-up experimentations. TensorLayer is easy to extended and modified, suitable for both machine learning researches and applications. Welcome contribution!

TensorLayer features include:

  • Fast prototyping through highly modular built-in neural network layers, pre-train metrices, regularizers, optimizers, cost functions...
  • Implemented by straightforward code, easy to modify and extend by yourself...
  • Other wraping libraries for TensorFlow are easy to merged into TensorLayer, suitable for machine learning researches...
  • Many official examples covering Dropout, DropConnect, DropNeuron, Denoising Autoencoder, LSTM, ResNet... are given, suitable for machine learning applications...
  • The running speed would not decrease compare with pure TensorFlow codes...

Now, go through the Overview to see how powerful it is !!!

Table of Contents

  1. Library Structure
  2. Overview
  3. Easy to Modify
  4. Installation
  5. Ways to Contribute
  6. Online Documentation
  7. Download Documentation

--

Library Structure

<folder>
├── tensorlayer  		<--- library source code
│
├── setup.py			<--- use ‘python setup.py install’ or ‘pip install . -e‘, to install
├── docs 				<--- readthedocs folder
│   └── _build
│   	└──html
│			└──index.html <--- homepage of the documentation
├── tutorials_*.py	 	<--- tutorials include NLP, DL,RL etc.
├── .. 

--

Overview

More examples about Deep Learning, Reinforcement Learning and Nature Language Processing available on readthedocs, you can also download the docs file read it locally.

  1. Fully Connected Network
  2. Convolutional Neural Network
  3. Recurrent Neural Network
  4. Reinforcement Learning
  5. Cost Function

Fully Connected Network

TensorLayer provides large amount of state-of-the-art Layers including Dropout, DropConnect, ResNet, Pre-train and so on.

Placeholder:

All placeholder and variables can be initialized by the same way with Tensorflow's tutorial. For details please read tensorflow-placeholder, tensorflow-variables and tensorflow-math.

# For MNIST example, 28*28 images have 784 pixels, i.e, 784 inputs.
import tensorflow as tf
import tensorlayer as tl
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')

Rectifying Network with Dropout:

# Define the network
network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.DropoutLayer(network, keep=0.8, name='drop1')
network = tl.layers.DenseLayer(network, n_units=800, act = tf.nn.relu, name='relu1')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')
network = tl.layers.DenseLayer(network, n_units=800, act = tf.nn.relu, name='relu2')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')
network = tl.layers.DenseLayer(network, n_units=10, act = identity, name='output_layer')
# Start training
...

Vanilla Sparse Autoencoder:

# Define the network
network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.DenseLayer(network, n_units=196, act = tf.nn.sigmoid, name='sigmoid1')
recon_layer1 = tl.layers.ReconLayer(network, x_recon=x, n_units=784, act = tf.nn.sigmoid, name='recon_layer1')
# Start pre-train
sess.run(tf.initialize_all_variables())
recon_layer1.pretrain(sess, x=x, X_train=X_train, X_val=X_val, denoise_name=None, n_epoch=200, batch_size=128, print_freq=10, save=True, save_name='w1pre_')
...

Denoising Autoencoder:

# Define the network
network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.DropoutLayer(network, keep=0.5, name='denoising1')   
network = tl.layers.DenseLayer(network, n_units=196, act = tf.nn.relu, name='relu1')
recon_layer1 = tl.layers.ReconLayer(network, x_recon=x, n_units=784, act = tf.nn.softplus, name='recon_layer1')
# Start pre-train
sess.run(tf.initialize_all_variables())
recon_layer1.pretrain(sess, x=x, X_train=X_train, X_val=X_val, denoise_name='denoising1', n_epoch=200, batch_size=128, print_freq=10, save=True, save_name='w1pre_')
...

Stacked Denoising Autoencoders:

# Define the network
network = tl.layers.InputLayer(x, name='input_layer')
# denoise layer for Autoencoders
network = tl.layers.DropoutLayer(network, keep=0.5, name='denoising1')
# 1st layer
network = tl.layers.DropoutLayer(network, keep=0.8, name='drop1')
network = tl.layers.DenseLayer(network, n_units=800, act = tf.nn.relu, name='relu1')
x_recon1 = network.outputs
recon_layer1 = tl.layers.ReconLayer(network, x_recon=x, n_units=784, act = tf.nn.softplus, name='recon_layer1')
# 2nd layer
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')
network = tl.layers.DenseLayer(network, n_units=800, act = tf.nn.relu, name='relu2')
recon_layer2 = tl.layers.ReconLayer(network, x_recon=x_recon1, n_units=800, act = tf.nn.softplus, name='recon_layer2')
# 3rd layer
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')
network = tl.layers.DenseLayer(network, n_units=10, act = identity, name='output_layer')

sess.run(tf.initialize_all_variables())

# Print all parameters before pre-train
network.print_params()

# Pre-train Layer 1
recon_layer1.pretrain(sess, x=x, X_train=X_train, X_val=X_val, denoise_name='denoising1', n_epoch=100, batch_size=128, print_freq=10, save=True, save_name='w1pre_')
# Pre-train Layer 2
recon_layer2.pretrain(sess, x=x, X_train=X_train, X_val=X_val, denoise_name='denoising1', n_epoch=100, batch_size=128, print_freq=10, save=False)
# Start training
...

Convolutional Neural Network

Instead of feeding the images as 1D vectors, the images can be imported as 4D matrix, where [None, 28, 28, 1] represents [batchsize, height, width, channels]. Set 'batchsize' to 'None' means data with different batchsize can all filled into the placeholder.

x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
y_ = tf.placeholder(tf.int64, shape=[None,])

CNNs + MLP:

A 2 layers CNN followed by 2 fully connected layers can be defined by the following codes:

network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.Conv2dLayer(network,
                        act = tf.nn.relu,
                        shape = [5, 5, 1, 32],  # 32 features for each 5x5 patch
                        strides=[1, 1, 1, 1],
                        padding='SAME',
                        name ='cnn_layer1')     # output: (?, 28, 28, 32)
network = tl.layers.Pool2dLayer(network,
                        ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1],
                        padding='SAME',
                        pool = tf.nn.max_pool,
                        name ='pool_layer1',)   # output: (?, 14, 14, 32)
network = tl.layers.Conv2dLayer(network,
                        act = tf.nn.relu,
                        shape = [5, 5, 32, 64], # 64 features for each 5x5 patch
                        strides=[1, 1, 1, 1],
                        padding='SAME',
                        name ='cnn_layer2')     # output: (?, 14, 14, 64)
network = tl.layers.Pool2dLayer(network,
                        ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1],
                        padding='SAME',
                        pool = tf.nn.max_pool,
                        name ='pool_layer2',)   # output: (?, 7, 7, 64)
network = tl.layers.FlattenLayer(network, name='flatten_layer')                                # output: (?, 3136)
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop1')                              # output: (?, 3136)
network = tl.layers.DenseLayer(network, n_units=256, act = tf.nn.relu, name='relu1')           # output: (?, 256)
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')                              # output: (?, 256)
network = tl.layers.DenseLayer(network, n_units=10, act = tl.identity, name='output_layer')    # output: (?, 10)

For more powerful function, please go to readthedocs.

Recurrent Neural Network

LSTM:

For more powerful function, please go to readthedocs.

Reinforcement Learning

To understand Reinforcement Learning, a Blog (Deep Reinforcement Learning: Pong from Pixels) and a Paper (Playing Atari with Deep Reinforcement Learning) are recommended. To play with RL, use OpenAI Gym as benchmark is recommended.

Pong Game:

Atari Pong Game is a single agent example. Pong from Pixels using 130 lines of Python only (Code link) can be reimplemented as follow.

# Policy network
network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.DenseLayer(network, n_units= H , act = tf.nn.relu, name='relu_layer')
network = tl.layers.DenseLayer(network, n_units= 1 , act = tf.nn.sigmoid, name='output_layer')

Cost Function

TensorLayer provides a simple way to creat you own cost function. Take a MLP below for example.

network = tl.InputLayer(x, name='input_layer')
network = tl.DropoutLayer(network, keep=0.8, name='drop1')
network = tl.DenseLayer(network, n_units=800, act = tf.nn.relu, name='relu1')
network = tl.DropoutLayer(network, keep=0.5, name='drop2')
network = tl.DenseLayer(network, n_units=800, act = tf.nn.relu, name='relu2')
network = tl.DropoutLayer(network, keep=0.5, name='drop3')
network = tl.DenseLayer(network, n_units=10, act = identity, name='output_layer')

Regularization of Weights:

After initializing the variables, the informations of network parameters can be observed by using network.print_params().

sess.run(tf.initialize_all_variables())
network.print_params()
>> param 0: (784, 800) (mean: -0.000000, median: 0.000004 std: 0.035524)
>> param 1: (800,) (mean: 0.000000, median: 0.000000 std: 0.000000)
>> param 2: (800, 800) (mean: 0.000029, median: 0.000031 std: 0.035378)
>> param 3: (800,) (mean: 0.000000, median: 0.000000 std: 0.000000)
>> param 4: (800, 10) (mean: 0.000673, median: 0.000763 std: 0.049373)
>> param 5: (10,) (mean: 0.000000, median: 0.000000 std: 0.000000)
>> num of params: 1276810

The output of network is network.outputs, then the cross entropy can be defined as follow. Besides, to regularize the weights, the network.all_params contains all parameters of the network. In this case, network.all_params = [W1, b1, W2, b2, Wout, bout] according to param 0, 1 ... 5 shown by network.print_params(). Then max-norm regularization on W1 and W2 can be performed as follow.

y = network.outputs
cross_entropy = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(y, y_))
cost = cross_entropy
cost = cost + maxnorm_regularizer(1.0)(network.all_params[0]) + maxnorm_regularizer(1.0)(network.all_params[2])

Regularization of Activation Outputs:

Instance method network.print_layers() prints all outputs of different layers in order. To achieve regularization on activation output, you can use network.all_layers which contains all outputs of different layers. If you want to use L1 penalty on the activations of first hidden layer, just simply add tf.contrib.layers.l2_regularizer(lambda_l1)(network.all_layers[1]) to the cost function.

network.print_layers()
>> layer 0: Tensor("dropout/mul_1:0", shape=(?, 784), dtype=float32)
>> layer 1: Tensor("Relu:0", shape=(?, 800), dtype=float32)
>> layer 2: Tensor("dropout_1/mul_1:0", shape=(?, 800), dtype=float32)
>> layer 3: Tensor("Relu_1:0", shape=(?, 800), dtype=float32)
>> layer 4: Tensor("dropout_2/mul_1:0", shape=(?, 800), dtype=float32)
>> layer 5: Tensor("add_2:0", shape=(?, 10), dtype=float32)

For more powerful function, please go to readthedocs.

Easy to Modify

Modifying Pre-train Behaviour:

Greedy layer-wise pretrain is an important task for deep neural network initialization, while there are many kinds of pre-train metrices according to different architectures and applications.

For example, the pre-train process of Vanilla Sparse Autoencoder can be implemented by using KL divergence as the following code, but for Deep Rectifier Network, the sparsity can be implemented by using the L1 regularization of activation output.

# Vanilla Sparse Autoencoder
beta = 4
rho = 0.15
p_hat = tf.reduce_mean(activation_out, reduction_indices = 0)
KLD = beta * tf.reduce_sum( rho * tf.log(tf.div(rho, p_hat)) + (1- rho) * tf.log((1- rho)/ (tf.sub(float(1), p_hat))) )

For this reason, TensorLayer provides a simple way to modify or design your own pre-train metrice. For Autoencoder, TensorLayer uses ReconLayer.**init() to define the reconstruction layer and cost function, to define your own cost function, just simply modify the self.cost in ReconLayer.**init(). To creat your own cost expression please read Tensorflow Math. By default, ReconLayer only updates the weights and biases of previous 1 layer by using self.train_params = self.all _params[-4:], where the 4 parameters are [W_encoder, b_encoder, W_decoder, b_decoder]. If you want to update the parameters of previous 2 layers, simply modify [-4:] to [-6:].

ReconLayer.__init__(...):
    ...
    self.train_params = self.all_params[-4:]
    ...
	self.cost = mse + L1_a + L2_w

Adding Customized Regularizer:

Installation

TensorFlow Installation:

This library requires Tensorflow (version >= 0.8) to be installed: Tensorflow installation instructions.

GPU Setup:

GPU-version of Tensorflow requires CUDA and CuDNN to be installed.

CUDA, CuDNN installation instructions.

CUDA download.

CuDNN download.

TensorLayer Installation:

pip install git+https://github.com/xxx/xxx.git

Otherwise, you can also install from source by running (from source folder):

python setup.py install

Ways to Contribute

TensorLayer begins as an internal repository at Imperial College Lodnon, helping researchers to test their new methods. It now encourage researches from all over the world to publish their new methods so as to promote the development of machine learning.

Your method can be merged into TensorLayer, if you can prove your method xxx

Test script with detailed descriptions is required

Online Documentation

The documentation is placed in Read-the-Docs.

Download Documentation

Alternatively you can download the documentation via Read-the-Docs as well.