Problem with initialization
266918 opened this issue · 5 comments
Describe the bug
I have run the sequential model multiple times and it produces inconsistent results.
Model: "sequential"
Layer (type)    Output Shape    Param #
tcn (TCN)       (None, 10)      2940
dense (Dense)   (None, 1)       11
Total params: 2,951
Trainable params: 2,951
Non-trainable params: 0
Epoch 1/200
50/50 [==============================] - 3s 22ms/step - loss: 0.0433 - mae: 0.0433 - val_loss: 0.0069 - val_mae: 0.0069
Epoch 2/200
50/50 [==============================] - 0s 6ms/step - loss: 0.0220 - mae: 0.0220 - val_loss: 0.0094 - val_mae: 0.0094
For the second run I did not change any parameters; I just reran the model training, and it gave the following result.
Model: "sequential_1"
Layer (type)      Output Shape    Param #
tcn_1 (TCN)       (None, 10)      2940
dense_1 (Dense)   (None, 1)       11
Total params: 2,951
Trainable params: 2,951
Non-trainable params: 0
Paste a snippet
Here's the code. The data is the Amazon stock price, downloaded in CSV format. build_model() is called for every run.
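import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from tcn.tcn import TCN
from tensorflow import keras

window_size = 20  # number of past prices in each input window
batch_size = 32   # defined here but not passed to model.fit in this snippet
epochs = 200      # number of training epochs
filter_nums = 10  # nb_filters for the TCN layer
kernel_size = 4   # convolution kernel size for the TCN layer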
def get_dataset():
    df = pd.read_csv('./bars/AMZN Historical Data.csv', thousands=',')
    # Reverse the row order (presumably so that the oldest bar comes first).
    df = df[::-1]
    scaler = MinMaxScaler()
    open_arr = scaler.fit_transform(df['Open'].values.reshape(-1, 1)).reshape(-1)
    # Each sample is window_size consecutive prices; the label is the next price.
    X = np.zeros(shape=(len(open_arr) - window_size, window_size))
    label = np.zeros(shape=(len(open_arr) - window_size))
    for i in range(len(open_arr) - window_size):
        X[i, :] = open_arr[i:i+window_size]
        label[i] = open_arr[i+window_size]
    train_X = X[:2000, :]
    train_label = label[:2000]
    test_X = X[2000:3000, :]
    test_label = label[2000:3000]
    return train_X, train_label, test_X, test_label, scaler
def RMSE(pred, true):
    # Take the mean of the squared errors first, then the square root.
    return np.sqrt(np.mean(np.square(pred - true)))
def plot(pred, true):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(range(len(pred)), pred)
    ax.plot(range(len(true)), true)
    plt.show()
def build_model():
    train_X, train_label, test_X, test_label, scaler = get_dataset()
    model = keras.models.Sequential([
        keras.layers.Input(shape=(window_size, 1)),
        TCN(nb_filters=filter_nums,
            kernel_size=kernel_size,
            dilations=[1, 2, 4, 8]),
        keras.layers.Dense(units=1, activation='relu')
    ])
    model.summary()
    model.compile(optimizer='adam', loss='mae', metrics=['mae'])
    model.fit(train_X, train_label, validation_split=0.2, epochs=epochs)
    model.evaluate(test_X, test_label)
    prediction = model.predict(test_X)
    scaled_prediction = scaler.inverse_transform(prediction.reshape(-1, 1)).reshape(-1)
    scaled_test_label = scaler.inverse_transform(test_label.reshape(-1, 1)).reshape(-1)
    print('RMSE ', RMSE(scaled_prediction, scaled_test_label))
    plot(scaled_prediction, scaled_test_label)
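One note on the error metric in the snippet: RMSE takes the mean of the squared errors before the square root. Taking the root first collapses to the mean absolute error, since sqrt(x^2) = |x|. A minimal sketch with made-up numbers (the helper names and arrays below are illustrative only, not part of the original code):

```python
import numpy as np

def mae(pred, true):
    # Mean absolute error.
    return np.mean(np.abs(pred - true))

def mean_of_sqrt_of_square(pred, true):
    # sqrt(square(e)) is |e|, so this is just the MAE again.
    return np.mean(np.sqrt(np.square(pred - true)))

def rmse(pred, true):
    # Root of the mean squared error.
    return np.sqrt(np.mean(np.square(pred - true)))

# Made-up values: three exact predictions and one that is off by 4.
pred = np.array([1.0, 2.0, 3.0, 4.0])
true = np.array([1.0, 2.0, 3.0, 8.0])

print('MAE              ', mae(pred, true))
print('mean(sqrt(sq(e)))', mean_of_sqrt_of_square(pred, true))
print('RMSE             ', rmse(pred, true))
```

With these numbers the first two values coincide, while RMSE is larger because it weights the single big error more heavily.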
Dependencies
Tensorflow version
tensorflow 2.4.1
tensorflow-addons 0.12.1
tensorflow-estimator 2.4.0
@266918 can you paste a full snippet that I can execute on my machine?
@266918 can you upload this ./bars/AMZN Historical Data.csv too?
It's in the same dir: https://github.com/266918/TCN-AMAZON-Test/blob/main/AMZN%20Historical%20Data.csv
@philipperemy
@266918 that's an initialization problem. NNs are initialized randomly, and sometimes training gets stuck at the beginning. In your case, you can try passing something like this to the TCN layer:
use_skip_connections=True,
kernel_initializer='glorot_uniform'
It will help the gradients flow more smoothly.
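To illustrate why two otherwise identical runs start from different weights, here is a rough NumPy sketch of Glorot uniform initialization, which samples from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). This is not the Keras implementation; the function name, the shapes, and the seeds are assumptions chosen for illustration.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed):
    # Hypothetical helper: draw weights from U(-limit, limit).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    rng = np.random.default_rng(seed)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Shapes mimic a Dense layer mapping 10 features to 1 output.
w_a = glorot_uniform(10, 1, seed=0)
w_b = glorot_uniform(10, 1, seed=1)        # different seed, different start
w_a_again = glorot_uniform(10, 1, seed=0)  # same seed, identical start

print('same seed gives same weights:     ', np.array_equal(w_a, w_a_again))
print('different seed gives same weights:', np.array_equal(w_a, w_b))
```

In a real TensorFlow script, fixing the seeds (for example with tf.random.set_seed) plays a similar role, although it may not remove every source of run-to-run variation.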
And for your last layer you can remove the relu as well:
keras.layers.Dense(units=1, activation='linear')
I know the values you want to predict are positive. Your model will learn it by itself.
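A small NumPy sketch of how a relu on the output layer can get stuck: if an unlucky initialization makes every pre-activation negative, the output is 0 for all samples and the gradient with respect to the weights is 0 too, so training cannot move. The non-negative features and the all-negative weights below are assumptions picked to trigger that case; they are not taken from the model in this issue.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed for illustration: 32 samples with 10 strictly positive features.
x = rng.uniform(0.1, 1.0, size=(32, 10))
# Assumed unlucky initialization: every weight is negative.
w = np.full(10, -0.5)
b = 0.0

z = x @ w + b                   # pre-activations, all negative here
relu_out = np.maximum(z, 0.0)   # relu output is 0 for every sample

# d(output)/dw averaged over the batch.
# relu: x where z > 0, otherwise 0. linear: always x.
relu_grad_w = (x * (z > 0)[:, None]).mean(axis=0)
linear_grad_w = x.mean(axis=0)

print('relu outputs all zero:', bool(np.all(relu_out == 0)))
print('relu gradient:  ', relu_grad_w)
print('linear gradient:', linear_grad_w)
```

With a linear output the gradient stays non-zero in the same situation, so the optimizer can still move the weights.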