philipperemy/keras-tcn

Different versions of the Keras-TCN/TF combo produce different parameter counts for a TCN with the same hyperparameters

swapnilsayansaha opened this issue · 2 comments

Code snippet (Python 3.8.8):

from tcn import TCN
from tensorflow.keras.layers import Dense, MaxPooling1D, Flatten
from tensorflow.keras import Input, Model
import tensorflow as tf 

batch_size, timesteps, input_dim = 256, 200, 6
i = Input(shape=(timesteps, input_dim))
m = TCN(nb_filters=32, dilations=[1, 2, 4, 8, 16, 32, 64, 128])(i)
m = tf.reshape(m, [-1, 32, 1])

m = MaxPooling1D(pool_size=(2))(m)
m = Flatten()(m)
output1 = Dense(1, activation='linear', name='vel')(m)
output2 = Dense(1, activation='linear', name='head')(m)

model = Model(inputs=[i], outputs=[output1, output2])
model.summary()

When I run the above snippet using keras-tcn 3.3.0 and tensorflow 2.4.0 (gpu), I get the following network:


Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 200, 6)]     0                                            
__________________________________________________________________________________________________
tcn (TCN)                       (None, 32)           31840       input_1[0][0]                    
__________________________________________________________________________________________________
tf.reshape (TFOpLambda)         (None, 32, 1)        0           tcn[0][0]                        
__________________________________________________________________________________________________
max_pooling1d (MaxPooling1D)    (None, 16, 1)        0           tf.reshape[0][0]                 
__________________________________________________________________________________________________
flatten (Flatten)               (None, 16)           0           max_pooling1d[0][0]              
__________________________________________________________________________________________________
vel (Dense)                     (None, 1)            17          flatten[0][0]                    
__________________________________________________________________________________________________
head (Dense)                    (None, 1)            17          flatten[0][0]                    
==================================================================================================
Total params: 31,874
Trainable params: 31,874
Non-trainable params: 0

However, when I use keras-tcn 3.4.0 and tensorflow 2.4.1 (gpu), I get the following network:

WARNING:tensorflow:AutoGraph could not transform <bound method TCN.call of <tcn.tcn.TCN object at 0x7fc45435d550>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 200, 6)]     0                                            
__________________________________________________________________________________________________
tcn (TCN)                       (None, 32)           47392       input_1[0][0]                    
__________________________________________________________________________________________________
tf.reshape (TFOpLambda)         (None, 32, 1)        0           tcn[0][0]                        
__________________________________________________________________________________________________
max_pooling1d (MaxPooling1D)    (None, 16, 1)        0           tf.reshape[0][0]                 
__________________________________________________________________________________________________
flatten (Flatten)               (None, 16)           0           max_pooling1d[0][0]              
__________________________________________________________________________________________________
vel (Dense)                     (None, 1)            17          flatten[0][0]                    
__________________________________________________________________________________________________
head (Dense)                    (None, 1)            17          flatten[0][0]                    
==================================================================================================
Total params: 47,426
Trainable params: 47,426
Non-trainable params: 0

Can someone suggest what's wrong? Why would the same network see a jump of almost 16k parameters across package versions?

Also note that the TCN with keras-tcn 3.4.0 and tensorflow 2.4.1 (gpu) does not learn anything and just outputs noise, whereas the TCN with keras-tcn 3.3.0 and tensorflow 2.4.0 (gpu) works fine.

@swapnilsayansaha sorry for the late reply. Two default parameters have changed from 3.3.0 to 3.4.0: kernel_size and use_skip_connections.

For your problem it seems better to use a smaller kernel size (2), without skip connections.
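Skip connections only sum existing tensors, so they add no trainable weights; the kernel size alone accounts for the 15,552-parameter gap. Under one plausible reading of the block layout (two dilated Conv1D layers per residual block, plus a 1x1 matching conv in the first block, an assumption rather than the library source), the two totals can be reproduced by hand:

```python
# Back-of-the-envelope parameter count for a keras-tcn-style TCN.
# Assumed layout (hypothetical, not read from the library source):
# each residual block has two dilated Conv1D layers, and the first
# block adds a 1x1 Conv1D to match input_dim to nb_filters.

def tcn_param_count(input_dim, nb_filters, n_blocks, kernel_size):
    def conv1d_params(channels_in):
        # kernel weights plus one bias per filter
        return (kernel_size * channels_in + 1) * nb_filters

    # first block: input_dim -> nb_filters, plus the 1x1 matching conv
    total = conv1d_params(input_dim) + conv1d_params(nb_filters)
    total += (input_dim + 1) * nb_filters
    # remaining blocks: nb_filters -> nb_filters, two convs each
    total += (n_blocks - 1) * 2 * conv1d_params(nb_filters)
    return total

# 8 dilation levels = 8 residual blocks in the snippet above
print(tcn_param_count(6, 32, 8, kernel_size=2))  # 31840, the 3.3.0 count
print(tcn_param_count(6, 32, 8, kernel_size=3))  # 47392, the 3.4.0 count
```

Both values match the `tcn (TCN)` rows in the two summaries exactly, which supports pinning `kernel_size=2` to recover the 3.3.0 behaviour.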

Change from 3.3.0 to 3.4.0

(screenshot: diff of the TCN default arguments between 3.3.0 and 3.4.0, showing the changed `kernel_size` and `use_skip_connections` defaults)

3.4.0 with the same number of weights as 3.3.0

import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense, MaxPooling1D, Flatten

from tcn import TCN

batch_size, timesteps, input_dim = 256, 200, 6
i = Input(shape=(timesteps, input_dim))
m = TCN(kernel_size=2, use_skip_connections=False, nb_filters=32, dilations=[1, 2, 4, 8, 16, 32, 64, 128])(i)
m = tf.reshape(m, [-1, 32, 1])

m = MaxPooling1D(pool_size=(2))(m)
m = Flatten()(m)
output1 = Dense(1, activation='linear', name='vel')(m)
output2 = Dense(1, activation='linear', name='head')(m)

model = Model(inputs=[i], outputs=[output1, output2])
model.summary()

Output

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 200, 6)]     0           []                               
                                                                                                  
 tcn (TCN)                      (None, 32)           31840       ['input_1[0][0]']                
                                                                                                  
 tf.reshape (TFOpLambda)        (None, 32, 1)        0           ['tcn[0][0]']                    
                                                                                                  
 max_pooling1d (MaxPooling1D)   (None, 16, 1)        0           ['tf.reshape[0][0]']             
                                                                                                  
 flatten (Flatten)              (None, 16)           0           ['max_pooling1d[0][0]']          
                                                                                                  
 vel (Dense)                    (None, 1)            17          ['flatten[0][0]']                
                                                                                                  
 head (Dense)                   (None, 1)            17          ['flatten[0][0]']                
                                                                                                  
==================================================================================================
Total params: 31,874
Trainable params: 31,874
Non-trainable params: 0
__________________________________________________________________________________________________