philipperemy/keras-tcn

Problem with initialization

266918 opened this issue · 5 comments

Describe the bug
I have run the sequential model multiple times and it produces inconsistent results.
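Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
tcn (TCN)                    (None, 10)                2940
_________________________________________________________________
dense (Dense)                (None, 1)                 11
=================================================================
Total params: 2,951
Trainable params: 2,951
Non-trainable params: 0
_________________________________________________________________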

Epoch 1/200
50/50 [==============================] - 3s 22ms/step - loss: 0.0433 - mae: 0.0433 - val_loss: 0.0069 - val_mae: 0.0069
Epoch 2/200
50/50 [==============================] - 0s 6ms/step - loss: 0.0220 - mae: 0.0220 - val_loss: 0.0094 - val_mae: 0.0094

The result looks good.
[image: plot from the first run]

For the second run I did not change any parameter. I just reran the model training and got the following result.
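Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
tcn_1 (TCN)                  (None, 10)                2940
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 11
=================================================================
Total params: 2,951
Trainable params: 2,951
Non-trainable params: 0
_________________________________________________________________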
[image: plot from the second run]

Paste a snippet
Here's the code. The data is the Amazon stock price, downloaded in CSV format. build_model() is called on every run.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from tcn.tcn import TCN
from tensorflow import keras

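window_size = 20   # number of past prices in each input window
batch_size = 32    # defined but never passed to model.fit (Keras defaults to 32)
epochs = 200       # training epochs
filter_nums = 10   # number of TCN filters
kernel_size = 4    # TCN convolution kernel size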

def get_dataset():
    df = pd.read_csv('./bars/AMZN Historical Data.csv', thousands=',')

    # Reverse the row order (presumably newest-first in the CSV).
    df = df[::-1]

    # Scale the opening prices to [0, 1].
    scaler = MinMaxScaler()
    open_arr = scaler.fit_transform(df['Open'].values.reshape(-1, 1)).reshape(-1)
    # Build sliding windows: each row of X holds window_size prices,
    # and label holds the price that follows the window.
    X = np.zeros(shape=(len(open_arr) - window_size, window_size))
    label = np.zeros(shape=(len(open_arr) - window_size))
    for i in range(len(open_arr) - window_size):
        X[i, :] = open_arr[i:i+window_size]
        label[i] = open_arr[i+window_size]
    train_X = X[:2000, :]
    train_label = label[:2000]
    test_X = X[2000:3000, :]
    test_label = label[2000:3000]
    return train_X, train_label, test_X, test_label, scaler

def RMSE(pred, true):
    # Root-mean-square error: average the squared errors first, then take the root.
    return np.sqrt(np.mean(np.square(pred - true)))

def plot(pred, true):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(range(len(pred)), pred)
    ax.plot(range(len(true)), true)
    plt.show()

def build_model():
    train_X, train_label, test_X, test_label, scaler = get_dataset()
    model = keras.models.Sequential([
        keras.layers.Input(shape=(window_size, 1)),
        TCN(nb_filters=filter_nums,
            kernel_size=kernel_size,
            dilations=[1, 2, 4, 8]),
        keras.layers.Dense(units=1, activation='relu')
    ])
    model.summary()
    model.compile(optimizer='adam', loss='mae', metrics=['mae'])
    model.fit(train_X, train_label, validation_split=0.2, epochs=epochs)

    model.evaluate(test_X, test_label)
    prediction = model.predict(test_X)
    # Map predictions and labels back to the original price scale.
    scaled_prediction = scaler.inverse_transform(prediction.reshape(-1, 1)).reshape(-1)
    scaled_test_label = scaler.inverse_transform(test_label.reshape(-1, 1)).reshape(-1)
    print('RMSE ', RMSE(scaled_prediction, scaled_test_label))
    plot(scaled_prediction, scaled_test_label)
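
A side note on the error metric, as a minimal sketch: root-mean-square error and mean absolute error are easy to mix up, because taking the square root before averaging (mean(sqrt(square(e)))) collapses to the mean absolute error. The helper names and the toy arrays below are made up for illustration and are not part of the issue's data.

```python
import numpy as np

def mae(pred, true):
    """Mean absolute error: the average of |pred - true|."""
    return np.mean(np.abs(pred - true))

def rmse(pred, true):
    """Root-mean-square error: square, average, then take the root."""
    return np.sqrt(np.mean(np.square(pred - true)))

# Toy data (hypothetical): three exact predictions and one large miss.
pred = np.array([1.0, 2.0, 3.0, 10.0])
true = np.array([1.0, 2.0, 3.0, 2.0])

# Root-before-mean gives the same number as the mean absolute error.
root_before_mean = np.mean(np.sqrt(np.square(pred - true)))

print("MAE             ", mae(pred, true))
print("root before mean", root_before_mean)
print("RMSE            ", rmse(pred, true))
```

RMSE weights the single large miss more heavily than MAE does, so the two numbers diverge whenever the errors are uneven.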

Dependencies
Tensorflow version
tensorflow 2.4.1
tensorflow-addons 0.12.1
tensorflow-estimator 2.4.0

@266918 can you paste a full snippet that I can execute on my machine?

@266918 can you upload this ./bars/AMZN Historical Data.csv too?

@266918 that's an initialization problem. Neural networks are initialized randomly, and sometimes training gets stuck right from the start because of a bad starting point. In your case, you can try passing something like this to the TCN layer:

use_skip_connections=True,
kernel_initializer='glorot_uniform'

It will help the gradients flow more smoothly.
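
To illustrate why two otherwise identical runs can start from different points, here is a minimal NumPy sketch. It re-implements the Glorot uniform rule by hand (weights drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))). The helper `glorot_uniform` below is hypothetical and is not the Keras initializer itself. In TensorFlow 2.x the analogous step would be to call `tf.random.set_seed(...)` before building the model, although seeding alone may not make GPU training exactly repeatable.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed=None):
    """Hypothetical helper: draw a (fan_in, fan_out) weight matrix from
    U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Unseeded: each call, like each freshly built model, gets different weights.
w_a = glorot_uniform(10, 1)
w_b = glorot_uniform(10, 1)

# Seeded: the same seed reproduces the same starting weights.
w_c = glorot_uniform(10, 1, seed=42)
w_d = glorot_uniform(10, 1, seed=42)

print("unseeded runs match:", np.array_equal(w_a, w_b))
print("seeded runs match:  ", np.array_equal(w_c, w_d))
```

Fixing the seed makes a good or bad run repeatable, which helps when debugging, but it does not remove the underlying sensitivity to the starting point.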

And for your last layer you can remove the relu as well:

keras.layers.Dense(units=1, activation='linear')

I know the values you want to predict are positive, but your model will learn that by itself.
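
One way a relu output layer can get stuck, sketched under stated assumptions: if the initial weights make the pre-activation negative for every sample, the output is zero everywhere and no gradient reaches the weights, while a linear output still gets a usable gradient. The single dense unit, the hand-picked weights, and the random toy data below are all hypothetical; only the MAE loss matches the snippet in this issue.

```python
import numpy as np

def mae_grad_wrt_weights(x, y, w, b, activation):
    """Gradient of mean(|pred - y|) with respect to w for one dense unit."""
    z = x @ w + b                            # pre-activation, shape (n,)
    if activation == "relu":
        pred = np.maximum(z, 0.0)
        dpred_dz = (z > 0).astype(float)     # 0 wherever the unit is inactive
    else:                                    # "linear"
        pred = z
        dpred_dz = np.ones_like(z)
    dloss_dpred = np.sign(pred - y) / len(y)
    return x.T @ (dloss_dpred * dpred_dz)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(8, 10))      # non-negative features (assumed)
y = rng.uniform(0.1, 1.0, size=8)            # positive targets, like scaled prices
w = -np.ones(10)                             # hand-picked "unlucky" start
b = 0.0

grad_relu = mae_grad_wrt_weights(x, y, w, b, "relu")
grad_linear = mae_grad_wrt_weights(x, y, w, b, "linear")

print("relu gradient:  ", grad_relu)
print("linear gradient:", grad_linear)
```

With the relu output, every entry of the gradient is zero, so gradient descent cannot move these weights. With the linear output, the gradient is non-zero and points toward larger predictions.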