The parameter of the TCN

Question

yanghui-wng opened this issue 2 years ago · 7 comments

I am studying the keras-tcn code on GitHub (https://github.com/philipperemy/keras-tcn). For the model below, "model.summary()" reports 153500 parameters for the TCN layer, but when I calculate the count by hand I get 153000. How is the value 153500 obtained?

Code

from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense, LeakyReLU

from tcn import TCN

# design network
batch_size = None
model = Sequential()
input_layer = Input(batch_shape=(batch_size, 1, 7))  # (batch, timesteps, features)
model.add(input_layer)
model.add(TCN(nb_filters=100,  # Integer. The number of filters to use in the convolutional layers. Would be similar to units for LSTM. Can be a list.
              kernel_size=3,  # Integer. The size of the kernel to use in each convolutional layer.
              nb_stacks=1,  # The number of stacks of residual blocks to use.
              dilations=(1, 2, 4),  # List/Tuple. A dilation list. Example is: [1, 2, 4, 8, 16, 32, 64].
              padding='causal',
              use_skip_connections=False,
              dropout_rate=0.1,
              return_sequences=False,
              activation='relu',
              kernel_initializer='he_normal',
              use_batch_norm=False,
              use_layer_norm=False,
              ))
model.add(Dense(64))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(32))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(16))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(1))
model.add(LeakyReLU(alpha=0.3))
model.compile(loss='mse', optimizer='adam')
model.summary()

Output

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
tcn_2 (TCN)                  (None, 100)               153500    
_________________________________________________________________
dense_8 (Dense)              (None, 64)                6464      
_________________________________________________________________
leaky_re_lu_8 (LeakyReLU)    (None, 64)                0         
_________________________________________________________________
dense_9 (Dense)              (None, 32)                2080      
_________________________________________________________________
leaky_re_lu_9 (LeakyReLU)    (None, 32)                0         
_________________________________________________________________
dense_10 (Dense)             (None, 16)                528       
_________________________________________________________________
leaky_re_lu_10 (LeakyReLU)   (None, 16)                0         
_________________________________________________________________
dense_11 (Dense)             (None, 1)                 17        
_________________________________________________________________
leaky_re_lu_11 (LeakyReLU)   (None, 1)                 0         
=================================================================
Total params: 162,589
Trainable params: 162,589
Non-trainable params: 0
_________________________________________________________________

Answer 1 · 2022-09-25T14:56:31.000Z

@yanghui-wng here is a detailed version of the weights contained in the TCN model:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
matching_conv1D (Conv1D)     multiple                  800
_________________________________________________________________
Act_Res_Block (Activation)   multiple                  0
_________________________________________________________________
conv1D_0 (Conv1D)            multiple                  2200
_________________________________________________________________
Act_Conv1D_0 (Activation)    multiple                  0
_________________________________________________________________
SDropout_0 (SpatialDropout1D multiple                  0
_________________________________________________________________
conv1D_1 (Conv1D)            multiple                  30100
_________________________________________________________________
Act_Conv1D_1 (Activation)    multiple                  0
_________________________________________________________________
SDropout_1 (SpatialDropout1D multiple                  0
_________________________________________________________________
Act_Conv_Blocks (Activation) multiple                  0
_________________________________________________________________
matching_identity (Lambda)   (None, 1, 100)            0
_________________________________________________________________
Act_Res_Block (Activation)   multiple                  0
_________________________________________________________________
conv1D_0 (Conv1D)            multiple                  30100
_________________________________________________________________
Act_Conv1D_0 (Activation)    multiple                  0
_________________________________________________________________
SDropout_0 (SpatialDropout1D multiple                  0
_________________________________________________________________
conv1D_1 (Conv1D)            multiple                  30100
_________________________________________________________________
Act_Conv1D_1 (Activation)    multiple                  0
_________________________________________________________________
SDropout_1 (SpatialDropout1D multiple                  0
_________________________________________________________________
Act_Conv_Blocks (Activation) multiple                  0
_________________________________________________________________
matching_identity (Lambda)   (None, 1, 100)            0
_________________________________________________________________
Act_Res_Block (Activation)   multiple                  0
_________________________________________________________________
conv1D_0 (Conv1D)            multiple                  30100
_________________________________________________________________
Act_Conv1D_0 (Activation)    multiple                  0
_________________________________________________________________
SDropout_0 (SpatialDropout1D multiple                  0
_________________________________________________________________
conv1D_1 (Conv1D)            multiple                  30100
_________________________________________________________________
Act_Conv1D_1 (Activation)    multiple                  0
_________________________________________________________________
SDropout_1 (SpatialDropout1D multiple                  0
_________________________________________________________________
Act_Conv_Blocks (Activation) multiple                  0
_________________________________________________________________
Slice_Output (Lambda)        multiple                  0
_________________________________________________________________
dense (Dense)                (None, 64)                6464
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 64)                0
_________________________________________________________________
dense_1 (Dense)              (None, 32)                2080
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 32)                0
_________________________________________________________________
dense_2 (Dense)              (None, 16)                528
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 16)                0
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 17
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 1)                 0
=================================================================
Total params: 162,589
Trainable params: 162,589
Non-trainable params: 0
_________________________________________________________________
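
The per-layer counts in this summary can be checked by hand. A Conv1D layer with a bias holds kernel_size * in_channels * filters weights plus one bias per filter, and a Dense layer holds in_units * out_units weights plus one bias per output unit. The sketch below is only a sanity check in plain Python; the helper names conv1d_params and dense_params are made up for illustration, and the kernel size of 1 for matching_conv1D is inferred from its count of 800 rather than stated in the summary.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter of a Conv1D layer."""
    return kernel_size * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit of a Dense layer."""
    return in_units * out_units + out_units


# The three distinct Conv1D rows in the summary (7 input features, 100 filters).
matching_conv = conv1d_params(1, 7, 100)   # matching_conv1D, assumed 1x1 kernel
first_conv = conv1d_params(3, 7, 100)      # first conv1D_0, fed by the raw input
inner_conv = conv1d_params(3, 100, 100)    # the five remaining 100 -> 100 convolutions

tcn_total = matching_conv + first_conv + 5 * inner_conv

# Dense head: 100 -> 64 -> 32 -> 16 -> 1.
dense_total = sum(dense_params(i, o)
                  for i, o in [(100, 64), (64, 32), (32, 16), (16, 1)])

print(matching_conv, first_conv, inner_conv)  # 800 2200 30100
print(tcn_total, dense_total, tcn_total + dense_total)  # 153500 9089 162589
```

The activation, dropout and Lambda layers contribute no weights, which is why only the Conv1D and Dense rows enter the sum.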

Answer 2 · 2022-09-25T14:57:27.000Z

And here are the TCN blocks (the breakdown by block):

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
residual_block_0 (ResidualBl multiple                  33100
_________________________________________________________________
residual_block_1 (ResidualBl multiple                  60200
_________________________________________________________________
residual_block_2 (ResidualBl multiple                  60200
_________________________________________________________________
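
The block totals follow the same arithmetic. As a rough sketch, the function below rebuilds the per-block counts from the hyperparameters. It assumes the structure visible in the summaries above: two Conv1D layers per residual block, plus a 1x1 matching convolution only in a block whose input channel count differs from nb_filters, with no batch or layer normalization. The name tcn_block_params is hypothetical and is not part of keras-tcn.

```python
def tcn_block_params(input_dim, nb_filters, kernel_size, n_blocks):
    """Per-block parameter counts under the assumptions stated above."""

    def conv(k, c_in, c_out):
        # Conv1D with bias: kernel weights plus one bias per filter.
        return k * c_in * c_out + c_out

    counts = []
    channels = input_dim
    for _ in range(n_blocks):
        total = conv(kernel_size, channels, nb_filters)     # conv1D_0
        total += conv(kernel_size, nb_filters, nb_filters)  # conv1D_1
        if channels != nb_filters:
            # Assumed: a 1x1 conv matches the residual input to nb_filters channels.
            total += conv(1, channels, nb_filters)
        counts.append(total)
        channels = nb_filters  # later blocks see nb_filters channels
    return counts


# One stack with dilations (1, 2, 4) gives three residual blocks.
blocks = tcn_block_params(input_dim=7, nb_filters=100, kernel_size=3, n_blocks=3)
print(blocks, sum(blocks))  # [33100, 60200, 60200] 153500
```

The dilation rate does not appear in the formula because dilation changes which timesteps a kernel reads, not how many weights it has.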

Answer 3 · 2022-09-25T15:03:21.000Z

Here is the graph of your model, generated with TensorBoard. You can generate it yourself and explore each node of your model.

Answer 4 · 2022-09-25T15:04:55.000Z

To reproduce it you can run this script:

import numpy as np
from tensorflow.keras import Input
from tensorflow.keras import Sequential
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LeakyReLU

from tcn import TCN

input_dim = 7
timesteps = 1

print('Loading data...')
x_train = np.zeros(shape=(100, timesteps, input_dim))
y_train = np.zeros(shape=(100, 1))

batch_size = None
model = Sequential()
input_layer = Input(batch_shape=(batch_size, timesteps, input_dim))
model.add(input_layer)
model.add(TCN(nb_filters=100,
              # Integer. The number of filters to use in the convolutional layers. Would be similar to units for LSTM. Can be a list.
              kernel_size=3,  # Integer. The size of the kernel to use in each convolutional layer.
              nb_stacks=1,  # The number of stacks of residual blocks to use.
              dilations=(1, 2, 4),  # List/Tuple. A dilation list. Example is: [1, 2, 4, 8, 16, 32, 64].
              padding='causal',
              use_skip_connections=False,
              dropout_rate=0.1,
              return_sequences=False,
              activation='relu',
              kernel_initializer='he_normal',
              use_batch_norm=False,
              use_layer_norm=False,
              ))
model.add(Dense(64))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(32))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(16))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(1))
model.add(LeakyReLU(alpha=0.3))
model.compile(loss='mse', optimizer='adam')

# tensorboard --logdir logs_tcn
# Browse to http://localhost:6006/#graphs&run=train.
# and double click on TCN to expand the inner layers.
# It takes time to write the graph to tensorboard. Wait until the first epoch is completed.
tensorboard = TensorBoard(
    log_dir='logs_tcn',
    histogram_freq=1,
    write_images=True
)

print('Train...')
model.fit(
    x_train, y_train,
    batch_size=batch_size,
    callbacks=[tensorboard],
    epochs=10
)

Run it and a folder called logs_tcn should be generated. In the same directory run:

tensorboard --logdir logs_tcn

And go to http://localhost:6006/.

Select the GRAPHS tab and you will see the model graph.

Answer 5 · 2022-09-25T15:05:37.000Z

With all these tools, you should be able to work out the answer.

Answer 6 · 2022-09-26T06:12:19.000Z

Thank you for your help! With your explanation, I now understand how to calculate the parameters of the TCN.

Answer 7 · 2022-09-26T06:53:54.000Z

Good to hear!